openlink / virtuoso-opensource

Virtuoso is a high-performance and scalable Multi-Model RDBMS, Data Integration Middleware, Linked Data Deployment, and HTTP Application Server Platform
https://vos.openlinksw.com

Could somebody show me how to use Bulk Loader to import a CSV into Virtuoso as RDF #1303

Closed candlecao closed 3 months ago

candlecao commented 3 months ago

I asked a similar question to ChatGPT 4, and it gave me the following guide:

1. Prepare the CSV file in a path that can be accessed by Virtuoso

For example, a snippet (1 header row and 3 rows of entries):

Chant_ID,incipit,genre,src_link
562633,Ecce nunc palam loqueris et,http://www.wikidata.org/entity/Q582093,https://cantusdatabase.org/source/123756
671551,Dominus tamquam ovis ad victimam,http://www.wikidata.org/entity/Q582093,https://cantusdatabase.org/source/669163
562160,Seniores populi consilium fecerunt ut,http://www.wikidata.org/entity/Q604748,https://cantusdatabase.org/source/123756
...

--This file is placed in a folder called "my_virdb" (the directory where the Virtuoso server process was started; files like virtuoso.ini and virtuoso.db are located there)

2. Prepare a text file with suffix .ld

ld_dir ('.', 'sampleData.csv', 'http://example.com/mygraph/UseBulkLoaderForCSV');
DB.DBA.TTLP_MT (file_to_string_output ('./mappings.ttl'), '', 'http://example.com/mygraph/UseBulkLoaderForCSV', 2);
rdf_loader_run();
checkpoint;

--Given that the CSV file is named "sampleData.csv" and placed in the "my_virdb" folder, I supposed '.' means sampleData.csv is in the current working directory (the directory where the server was started).
--This associates the CSV columns with URIs via the mapping (see below)

3. Define mappings.ttl that appeared in the .ld file as above

The content in the file, e.g.:

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix chant: <http://example.com/ontology/chant/> .
@prefix data: <http://example.com/data/> .

<http://example.com/mygraph/UseBulkLoaderForCSV> a rdfs:Graph ;
    rdfs:label "Chant data graph" .

chant:Chant_ID a rdfs:Property ;
    rdfs:label "Chant ID" ;
    rdfs:range xsd:string .

chant:incipit a rdfs:Property ;
    rdfs:label "Incipit" ;
    rdfs:range xsd:string .

chant:genre a rdfs:Property ;
    rdfs:label "Genre" ;
    rdfs:range xsd:URL .

chant:src_link a rdfs:Property ;
    rdfs:label "Source Link" ;
    rdfs:range xsd:URL .

--Place the mappings.ttl file in the same directory as your sampleData.csv.

4. Execute the script from the terminal

isql -U dba -P mysecret -S 1111 /path/to/my_virtdb/LoaderScript.ld


All of the above seemed to execute smoothly. However, after running the procedures above, when I executed select * from "DB.DBA.LOAD_LIST"; from the ISQL shell, or ran SPARQL SELECT * FROM <http://example.com/mygraph/UseBulkLoaderForCSV> WHERE { ?s ?p ?o } LIMIT 10;, no actual data was found!

What's wrong? Could somebody point me in the right direction?
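
(Aside for readers hitting the same wall: a quick way to check whether the RDF Bulk Loader actually queued and loaded anything is to inspect the DB.DBA.LOAD_LIST table from isql. The queries below are a minimal sketch using the graph IRI already mentioned in this thread; for a successfully loaded file, ll_state should be 2 and ll_error NULL.)

SELECT ll_file, ll_graph, ll_state, ll_error FROM DB.DBA.LOAD_LIST;
-- ll_state: 0 = queued, 1 = loading, 2 = loaded

SPARQL SELECT COUNT(*) FROM <http://example.com/mygraph/UseBulkLoaderForCSV> WHERE { ?s ?p ?o };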

candlecao commented 3 months ago

Supplemental: LoaderScript.ld is the name of the text file with suffix .ld

HughWilliams commented 3 months ago

That ChatGPT-generated output is incorrect (a hallucination): the Virtuoso RDF Bulk Loader only loads dataset files in the RDF formats indicated as supported.

To bulk load multiple CSV files, Virtuoso has an equivalent CSV Bulk Loader process, as detailed in the linked documentation.
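
(For reference, a minimal sketch of that CSV Bulk Loader flow from isql, once its procedures have been created; the directory, file mask, and use of checkpoint here are illustrative examples, not values copied from the documentation:)

csv_register ('./CSV', '*.csv');   -- queue CSV files found in the server-accessible ./CSV directory
csv_loader_run ();                 -- load each queued file into its own SQL table
checkpoint;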

kidehen commented 3 months ago

Note: Our Virtuoso Personal Assistant, an OpenLink Personal Assistant (OPAL) module, will provide you with better guidance about CSV bulk loading into Virtuoso. Just perform the following steps:

  1. Go to https://virtuoso.openlinksw.com and wait for the chatbot window to pop up
  2. Pose your question, e.g., I am stuck trying to load CSV into Virtuoso

You should get a different response from this OPAL-enclosed (and guardrailed) variant of ChatGPT, which is scoped to the Virtuoso Support Assistant module and its underlying Knowledge Graph.

candlecao commented 3 months ago

@HughWilliams Thank you. @kidehen Thank you. However, when I followed the tutorial and the OPAL guidance, I kept getting stuck with errors such as "Undefined procedure DB.DBA.csv_register" or "Undefined procedure DB.DBA.LOADE_RDF", etc.

HughWilliams commented 3 months ago

Have you loaded the CSV Bulk Loader scripts, which need to be loaded first to create the required CSV bulk loader functions, as detailed in the documentation?
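
(Two checks that may help here, sketched under the assumption that the procedure definitions from the documentation have been saved to a local file, e.g. csv_loader.sql; the file name and password are placeholders:)

-- from the shell: execute the script file with isql instead of pasting it into the prompt
isql 1111 dba <password> csv_loader.sql

-- from isql: confirm the CSV loader procedures now exist
SELECT P_NAME FROM DB.DBA.SYS_PROCEDURES WHERE upper (P_NAME) LIKE '%CSV%';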

candlecao commented 3 months ago

Hi, HughWilliams. I hadn't until you told me. But when I pasted the whole content underneath the headline "CSV Bulk Loader scripts" into the ISQL shell and executed it, it prompted an error (see the attached screenshot).

kidehen commented 3 months ago

Here's the complete script.

-- DROP TABLE IF EXISTS csv_load_list;
CREATE TABLE csv_load_list (
    cl_file VARCHAR, 
    cl_file_in_zip VARCHAR,
    cl_state INT DEFAULT 0, 
    cl_error LONG VARCHAR, 
    cl_table VARCHAR, 
    cl_options ANY,
    cl_started DATETIME, 
    cl_done DATETIME,
    PRIMARY KEY (cl_file, cl_file_in_zip));
CREATE INDEX cl_state ON csv_load_list (cl_state);

-- Callback for csv_parse: appends each parsed CSV row to the accumulator vector cbd.
CREATE PROCEDURE csv_cols_cb (
    INOUT  r    ANY, 
    IN     inx  INT, 
    INOUT  cbd  ANY)
{
  IF (cbd IS NULL)
    cbd := VECTOR ();
  cbd := vector_concat (cbd, VECTOR (r));
}
;

-- Infers column names and SQL data types from the CSV header row and a sample of data rows.
CREATE PROCEDURE csv_get_cols_array (
    INOUT  ss    ANY, 
    IN     hr    INT, 
    IN     offs  INT, 
    IN     opts  ANY)
{
  DECLARE h, res ANY;
  DECLARE inx, j, ncols, no_head INT;
  h := NULL;
  no_head := 0;
  IF (hr < 0)
  {
    no_head := 1;
    hr := 0;
  }
  IF (offs < 0)
    offs := 0;
  res := VECTOR ();
  csv_parse (ss, 'DB.DBA.csv_cols_cb', h, 0, offs + 10, opts);
  IF (h IS NOT NULL AND LENGTH (h) > offs)
  {
    DECLARE _row ANY;
    _row := h[hr];
    FOR (j := 0; j < LENGTH (_row); j := j + 1)           
    {
      res := vector_concat (res, VECTOR (VECTOR (SYS_ALFANUM_NAME (CAST (_row[j] AS VARCHAR)), NULL)));
    }
    FOR (inx := offs; inx < LENGTH (h); inx := inx + 1)
    { 
      _row := h[inx];
      FOR (j := 0; j < LENGTH (_row); j := j + 1)           
      {
        IF (res[j][1] IS NULL AND NOT (ISSTRING (_row[j]) AND _row[j] = '') AND _row[j] IS NOT NULL)
          res[j][1] := __tag (_row[j]);
        ELSE IF (__tag (_row[j]) <> res[j][1] AND 189 = res[j][1] AND (ISDOUBLE (_row[j]) OR isfloat (_row[j])))
          res[j][1] := __tag (_row[j]);
        ELSE IF (__tag (_row[j]) <> res[j][1] AND ISINTEGER (_row[j]) AND (res[j][1] = 219 OR 190 = res[j][1]))
          ;  
        ELSE IF (__tag (_row[j]) <> res[j][1])
          res[j][1] := -1;
      }
    } 
  }
  FOR (inx := 0; inx < LENGTH (res); inx := inx + 1)
  { 
    IF (NOT ISSTRING (res[inx][0]) AND NOT ISNULL (res[inx][0]))
      no_head := 1; 
    ELSE IF (trim (res[inx][0]) = '' OR ISNULL (res[inx][0]))
      res[inx][0] := sprintf ('COL%d', inx);     
  }  
  FOR (inx := 0; inx < LENGTH (res); inx := inx + 1)
  { 
    IF (res[inx][1] = -1 OR res[inx][1] IS NULL)
      res[inx][1] := 'VARCHAR';  
    ELSE
      res[inx][1] := dv_type_title (res[inx][1]);    
  }  
  IF (no_head)
  {
    FOR (inx := 0; inx < LENGTH (res); inx := inx + 1)
    { 
      res[inx][0] := sprintf ('COL%d', inx);
    }
  }
  RETURN res;
}
;
candlecao commented 3 months ago

Hi, HughWilliams. I did it! (Just by dividing those scripts into segments by ";" and executing them one by one.) Thank you. But after executing:

csv_register ('./CSV', '*.gz');
csv_loader_run ();

I only see the imported CSV data stored as SQL tables under Database > SQL Database Objects. How can I go further and convert it directly into RDF using the Bulk Loader?

candlecao commented 3 months ago


Thank you! I did it successfully.

HughWilliams commented 3 months ago

Now that you have successfully bulk loaded the CSV data into Virtuoso as SQL relational tables, you have to convert those tables to RDF Linked Data Views, as detailed in this Generation of Linked Data Views over Relational Data Sources with Virtuoso document.
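
(For completeness, and not as a substitute for the Linked Data Views approach Hugh links to: a small Virtuoso/PL loop can also materialize triples from the loaded SQL table directly into a named graph. This is only a sketch; the table name DB.DBA.sampleData, the column names, and the IRIs are assumptions taken from the sample CSV earlier in this thread.)

-- Sketch: materialize RDF from the SQL table produced by csv_loader_run ().
-- Assumes the loader created DB.DBA.sampleData with columns Chant_ID, incipit, genre, src_link.
CREATE PROCEDURE chant_csv_to_rdf ()
{
  DECLARE g VARCHAR;
  g := 'http://example.com/mygraph/UseBulkLoaderForCSV';
  FOR (SELECT Chant_ID, incipit, genre, src_link FROM DB.DBA.sampleData) DO
    {
      DECLARE s VARCHAR;
      s := sprintf ('http://example.com/data/chant/%s', CAST (Chant_ID AS VARCHAR));
      DB.DBA.RDF_QUAD_URI_L (g, s, 'http://example.com/ontology/chant/incipit', incipit);  -- literal object
      DB.DBA.RDF_QUAD_URI (g, s, 'http://example.com/ontology/chant/genre', genre);        -- IRI object
      DB.DBA.RDF_QUAD_URI (g, s, 'http://example.com/ontology/chant/src_link', src_link);  -- IRI object
    }
  COMMIT WORK;
}
;

chant_csv_to_rdf ();
checkpoint;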

candlecao commented 3 months ago

Thank you very much!