AtomGraph / LinkedDataHub

The low-code Knowledge Graph application platform. Apache license.

Could not import large RDF file. #150

Closed by FNakano 1 year ago

FNakano commented 1 year ago

Hello! I converted a CSV file using AtomGraph CSV2RDF. It generated a 193M Turtle (.ttl) file. Then I tried to import it following steps 1-4 of the documentation. Clicking on the save button (step 5) popped up an alert with null written in it.

Screenshot from 2023-01-29 20-28-24

Some messages are written in the terminal running LinkedDataHub:

linkeddatahub_1     | 00:52:57,489 [http-nio-7070-exec-1] DEBUG ModelXSLTWriterBase:252 - RDF/XML bytes written: 1124
nginx_1             | - - [29/Jan/2023:23:52:57 +0000] "GET /files/? HTTP/1.1" 200 47126 "https://localhost:4443/files/?" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/109.0"
nginx_1             | 2023/01/29 23:53:28 [error] 10#10: *423 client intended to send too large body: 193566581 bytes, client:, server: localhost, request: "POST /service? HTTP/1.1", host: "localhost:4443", referrer: "https://localhost:4443/files/?"
nginx_1             | - - [29/Jan/2023:23:53:28 +0000] "POST /service? HTTP/1.1" 413 183 "https://localhost:4443/files/?" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/109.0"

namedgraph commented 1 year ago

Hi. There are configured limits for the request body size:

193M triples is a lot of data. I would try to import it directly into the triplestore instead.
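
To check how far over the limit a request was, the rejected size can be read out of the nginx error line. This is a minimal Python sketch, assuming the log format shown in this thread; `rejected_body_size` is a hypothetical helper name, not part of nginx or LinkedDataHub.

```python
import re
from typing import Optional

# Matches the nginx error logged when a request body exceeds the configured limit.
_TOO_LARGE = re.compile(r"client intended to send too large body: (\d+) bytes")


def rejected_body_size(log_line: str) -> Optional[int]:
    """Return the rejected body size in bytes, or None if the line is unrelated."""
    match = _TOO_LARGE.search(log_line)
    return int(match.group(1)) if match else None


# Log line copied (shortened) from the first comment in this thread.
line = ('nginx_1 | 2023/01/29 23:53:28 [error] 10#10: *423 client intended to send '
        'too large body: 193566581 bytes, client:, server: localhost, '
        'request: "POST /service? HTTP/1.1"')
size = rejected_body_size(line)
print(size, "bytes, about", round(size / 1_000_000), "MB")
```

The number reported is the size of the upload in bytes, so it can be compared directly with the size of the file on disk.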

FNakano commented 1 year ago

One more piece of information...

Got the same message while trying to import a CSV file:

linkeddatahub_1     | 15:49:19,539 [http-nio-7070-exec-4] DEBUG BasedModelProvider:81 - RDF language used to read Model: Lang:RDF/XML
nginx_1             | - - [30/Jan/2023:14:49:19 +0000] "GET /sparql? HTTP/1.1" 200 2893 "https://localhost:4443/84f2dfb7-3c89-4405-abec-750e99c3e9c2/?" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/109.0"
nginx_1             | 2023/01/30 14:49:34 [error] 10#10: *124 client intended to send too large body: 10374282 bytes, client:, server: localhost, request: "POST /service? HTTP/1.1", host: "localhost:4443", referrer: "https://localhost:4443/84f2dfb7-3c89-4405-abec-750e99c3e9c2/?"
nginx_1             | - - [30/Jan/2023:14:49:34 +0000] "POST /service? HTTP/1.1" 413 183 "https://localhost:4443/84f2dfb7-3c89-4405-abec-750e99c3e9c2/?" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/109.0"

Screenshot from 2023-01-30 11-53-53

FNakano commented 1 year ago

Thanks for your reply.

> Hi. There are configured limits for the request body size:
>
> 193M triples is a lot of data. I would try to import it directly into the triplestore instead.

How can I import directly to the triplestore?

namedgraph commented 1 year ago

For example, using the SPARQL Graph Store Protocol:

Map fuseki-end-user/fuseki-admin service ports to the host as shown here:

Then end-user and admin Fuseki endpoints will be available on http://localhost:3031/ds and http://localhost:3030/ds, respectively.

FNakano commented 1 year ago

It worked, I think. A SELECT COUNT(*) SPARQL query on the imported graph reported almost 1.5M triples.

    SELECT (COUNT(*) AS ?count) WHERE {
      GRAPH <http://example/lab8>
      { ?s ?p ?o }
    }

Steps and Success Indicators

> Map fuseki-end-user/fuseki-admin service ports to the host as shown here:


      - 3030:3030
      - 3031:3030

I added these port mappings to the LinkedDataHub docker-compose.yml and restarted it with docker-compose up --build

Indicator: I browsed to http://localhost:3031/ds and http://localhost:3030/ds. Firefox downloaded a ds.trig file from each, one containing end-user data, the other containing admin data.

> For example, using the SPARQL Graph Store Protocol:

Downloaded the latest Fuseki binary (4.7.0 at the time of writing) and unzipped it. The SOH (SPARQL over HTTP) executables are inside the apache-jena-fuseki-4.7.0/bin folder.

> Then end-user and admin Fuseki endpoints will be available on http://localhost:3031/ds and http://localhost:3030/ds, respectively.

Inserted data by running ./s-put http://localhost:3031/ds http://example/lab8 ~/MeuGithub/CSV2RDF/example/lab8.ttl
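
In HTTP terms, the s-put call above is a Graph Store Protocol PUT, which replaces the contents of the named graph with the uploaded file. Here is a minimal Python sketch using only the standard library. It assumes the dataset URL accepts Graph Store Protocol requests with a `graph` query parameter, as the s-put call suggests. `build_gsp_put` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import os
import tempfile
import urllib.parse
import urllib.request


def build_gsp_put(endpoint: str, graph_uri: str, turtle_path: str) -> urllib.request.Request:
    """Build (but do not send) a Graph Store Protocol PUT for a Turtle file."""
    # The target graph is named by the percent-encoded "graph" query parameter.
    url = endpoint + "?" + urllib.parse.urlencode({"graph": graph_uri})
    # Passing a file object lets the body be streamed instead of read into memory.
    body = open(turtle_path, "rb")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "text/turtle",
            "Content-Length": str(os.path.getsize(turtle_path)),
        },
    )


# Demo with a tiny temporary file standing in for lab8.ttl.
with tempfile.NamedTemporaryFile("w", suffix=".ttl", delete=False) as tmp:
    tmp.write("<http://example/s> <http://example/p> <http://example/o> .\n")

request = build_gsp_put("http://localhost:3031/ds", "http://example/lab8", tmp.name)
print(request.get_method(), request.full_url)

# Clean up the demo file; a real upload would call urllib.request.urlopen(request) first.
request.data.close()
os.remove(tmp.name)
```

Because this request goes straight to the Fuseki port, it does not pass through the nginx proxy that rejected the earlier uploads.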

Indicator: use the LinkedDataHub SPARQL Editor to run the SPARQL query at the top of this post, then check whether the inserted triple count looks right.
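
The same check could also be scripted. Below is a minimal Python sketch that builds the count query for a named graph and reads the count from a response in the standard SPARQL 1.1 JSON results format. The helper names `count_query` and `read_count` are hypothetical, and the sample response and its value are made up for illustration rather than taken from a real run.

```python
import json


def count_query(graph_uri: str) -> str:
    """Build a SPARQL query that counts the triples in one named graph."""
    return (
        "SELECT (COUNT(*) AS ?count) WHERE {\n"
        f"  GRAPH <{graph_uri}> {{ ?s ?p ?o }}\n"
        "}"
    )


def read_count(results_json: str) -> int:
    """Extract ?count from a SPARQL 1.1 JSON results document."""
    bindings = json.loads(results_json)["results"]["bindings"]
    return int(bindings[0]["count"]["value"])


# Hypothetical response body; the number is invented for this example.
sample = json.dumps({
    "head": {"vars": ["count"]},
    "results": {"bindings": [{"count": {
        "type": "literal",
        "datatype": "http://www.w3.org/2001/XMLSchema#integer",
        "value": "1234",
    }}]},
})

print(count_query("http://example/lab8"))
print(read_count(sample))
```

Comparing the returned count with the number of triples expected from the source file gives a quick sanity check that the whole file was loaded.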