Closed VladimirAlexiev closed 4 years ago
Using a single logicalSource like this doesn't help:
<customer!source> a rml:BaseSource;
rr:tableName "_final.customer".
<person/(customer_id)!map>
rml:logicalSource <customer!source>.
The abort after 8 email triples is caused by #25, so I hope that once that is fixed it will process all triples. Still, processing all maps per row would be faster.
Hopefully, solving issue #25 will also solve this problem. Thank you for the suggestion regarding execution per row. The RDFizer works under the assumption that each individual triples map has a different logical source. What you are recommending could be useful if we work under the assumption that each individual triples map has the same source.
@eiglesias34 I think you don't need to assume:
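In either case, you can process such sources once. Here's some pseudo-code:
group the triples maps by logical source into triples-map groups (tmg)
for each tmg, query its source once
for each row, for each triples map (tm) in the current tmg
emit the triples generated from tm and the current row
Cheers!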
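One way to read the suggestion above is the following minimal Python sketch. It is not the RDFizer's actual code: `TriplesMap`, `group_by_source`, `rdfize`, and the `ex:` predicates are hypothetical names made up for illustration, and an in-memory SQLite table stands in for the Postgres table so the example runs on its own.

```python
import sqlite3
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Iterator

Triple = tuple[str, str, str]


@dataclass(frozen=True)
class TriplesMap:
    """Hypothetical stand-in for a parsed triples map."""
    name: str
    source: str  # table name of the logical source
    make_triples: Callable[[sqlite3.Row], list[Triple]]


def group_by_source(maps: list[TriplesMap]) -> dict[str, list[TriplesMap]]:
    """Collect triples maps that read the same table into one group (tmg)."""
    groups: dict[str, list[TriplesMap]] = defaultdict(list)
    for tm in maps:
        groups[tm.source].append(tm)
    return dict(groups)


def rdfize(conn: sqlite3.Connection, maps: list[TriplesMap]) -> Iterator[Triple]:
    """Scan each source once and apply every map of its group to each row."""
    conn.row_factory = sqlite3.Row
    for source, tmg in group_by_source(maps).items():
        # The table name comes from the mapping file, not from user input.
        for row in conn.execute(f'SELECT * FROM "{source}"'):
            for tm in tmg:
                yield from tm.make_triples(row)


# Tiny demo table (assumed columns; the real table has 23 fields).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER, birth TEXT, email TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "1970-01-01", "a@example.org"), (2, "1980-02-02", "b@example.org")],
)


def subject(row: sqlite3.Row) -> str:
    return f"<person/{row['customer_id']}>"


maps = [
    TriplesMap("birth!map", "customer", lambda r: [(subject(r), "ex:birth", r["birth"])]),
    TriplesMap("email!map", "customer", lambda r: [(subject(r), "ex:email", r["email"])]),
]

triples = list(rdfize(conn, maps))
for t in triples:
    print(*t)
```

With this grouping the table is queried once no matter how many triples maps read from it, and the triples for one customer come out next to each other.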
Thank you very much for the suggestion. I will take this into consideration for the following release of the RDFizer.
Seems that issue is solved, closing...
I have a moderately sized table with 1.344M rows and 23 fields. The fields are mapped to 15 nodes and 33 triples. The mapping looks like this:
The mapping is generated from a semantic model using my tool rdf2rml, that's why I don't use a single rml:logicalSource but several in blank nodes.
Your tool makes nearly all triples from birth!map, then 8 triples from email!map and then quits. This is on Postgres (I'll run this again to check).
Even if it "rewound" the database (reran the select * query) to process all maps in sequence, that's considerably slower than iterating each row once and processing all maps on that row (which would cause all triples for one customer to be emitted together).
I'll try replacing the blank nodes with a single rml:logicalSource and see if that helps.