Closed tjroamer closed 2 years ago
Indeed, this would be very useful. For now, I can point you to the OpenRefine API, which allows for just such a thing. In the code, you can look at the server-side commands to see the GET and POST request and response elements. I haven't specifically documented them for API use, but it should be doable.
Let me know if you have any success or trouble with that and I'll look into documenting a specific use. After getting the data loaded, it should resolve to using a SaveRDFTransformCommand and an OpenRefine export command using one of the registered RDF Transform exports. Iterate on each data file / project.
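As a rough illustration of that flow, here is a minimal Python sketch of the loop over data files. The three step functions (create_project, save_transform, export_rdf) are hypothetical placeholders passed in by the caller; they are not part of OpenRefine or RDF Transform, and the actual HTTP calls behind them are not shown here.

```python
from pathlib import Path
from typing import Callable, List


def run_batch(
    input_dir: Path,
    output_dir: Path,
    create_project: Callable[[Path], str],
    save_transform: Callable[[str], None],
    export_rdf: Callable[[str, Path], None],
) -> List[Path]:
    """Load, transform and export every XML file found in input_dir."""
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for xml_file in sorted(input_dir.glob("*.xml")):
        # Step 1: load the data file into a new project (hypothetical helper).
        project_id = create_project(xml_file)
        # Step 2: attach the shared RDF Transform mapping (hypothetical helper).
        save_transform(project_id)
        # Step 3: export through one of the RDF Transform exporters.
        target = output_dir / (xml_file.stem + ".ttl")
        export_rdf(project_id, target)
        written.append(target)
    return written
```

Keeping the steps as injected functions means the loop can be tried with stand-ins before the real API calls are worked out.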
@tjroamer You might also ask @felixlohmeier if it's already possible and what his thoughts are with his OpenRefine client
Thanks for pointing out this tool. I tried it and found that it works for projects that use "RDF Extension". I got the same RDF files as those exported from the browser app. Unfortunately, it does not support projects with "RDF Transform".
If RDF transform requires specific API calls, then it would be great to document them. I probably won't get around to implementing this until next year. Maybe a maintainer of another client library will be faster.
Thanks. I tried to export an existing RDF-Transform project to Turtle. When I exported a Turtle file in the browser app using Export->RDF Transform->Pretty Exports->RDF as Turtle, the server console window showed me [refine] POST /command/core/export-rows/myproject.ttl (9098ms), so I assume that this is the POST command I can use to execute the Turtle export in batch mode. I ran the following command in a Postman window (2620164079995 is the project id):
http://localhost:3333/command/core/export-rows/myproject.ttl?project=2620164079995&
but I got the following error:
However, the following command works well, and I got the expected models
http://localhost:3333/command/core/get-models?project=2620164079995&
I assume that I was not using the correct API to export rows. You might be able to point out the errors.
Appreciate your help.
You're on the right path. I don't think that is the full export command; it is missing components. It's a POST command, so it also needs to specify the export engine form parameters. See the OpenRefine API Export documentation. I'll take a closer look at it as well.
Here is an example with cURL that might help: https://gist.github.com/felixlohmeier/d76bd27fbc4b8ab6d683822cdf61f81d#file-templates-sh-L347
Thanks. I managed to get it to work. I had forgotten the format parameter in the previous session.
@AtesComp I used the following configuration for the export-rows:
project = 2620164079995
format = turtle
But the result I got is the same as the one exported via the browser app with Export->RDF as Turtle. This is not what I expected. Nevertheless, this might not be a surprise, because I did not tell OpenRefine any specifics about RDF-Transform. I assume RDF-Transform uses a specific engine for its export. It would be good to know what settings are necessary to get a result like Export->RDF Transform->Pretty export->RDF as Turtle.
Thanks.
The client side code uses an extra type designator for the format. So, (note the space)
format = "RDF/XML (Pretty)"
format = "Turtle (Pretty)"
format = "Turtle* (Pretty)"
format = "N3 (Pretty)"
...etc.
Types are:
" (Pretty)"
" (Blocks)"
" (Flat)"
" (Binary)"
The type is added to the appropriate export formats only. I took these types from the Jena documentation on streams vs. the pretty formats. See RDFFormats. Then there is:
"RDFNull (Test)"
Try that and let me know. For more, the client side code is at this location. See the constructExportRDF() function for strType and the #exportRDF function.
I should probably standardize these export format names on the actual Jena RDFFormat names. What do you think?
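To make the naming rule above concrete, here is a small Python sketch that composes a format value from a base name and a type designator. The helper name export_format is made up for illustration. It only checks the type against the four listed above; it does not know which base formats accept which types, since that mapping is not spelled out here.

```python
# Type designators listed above; each one is appended after a single space.
EXPORT_TYPES = ("Pretty", "Blocks", "Flat", "Binary")


def export_format(base: str, export_type: str) -> str:
    """Build a `format` value such as "Turtle (Pretty)"."""
    if export_type not in EXPORT_TYPES:
        raise ValueError(f"unknown export type: {export_type!r}")
    # Note the space between the base name and the parenthesised type.
    return f"{base} ({export_type})"


if __name__ == "__main__":
    for base in ("RDF/XML", "Turtle", "Turtle*", "N3"):
        print(export_format(base, "Pretty"))
```

The special "RDFNull (Test)" value does not follow this pattern and would have to be passed as a literal string.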
Great, it worked. I used the following configuration for the POST call:
# POST call
http://localhost:3333/command/core/export-rows
# parameters
project = 2620164079995
format = Turtle (Pretty)
# body, x-www-form-urlencoded
engine = {"facets":[],"mode":"record-based"}
Yes, it makes sense to take the documented Jena RDFFormat names. Thanks.
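For anyone scripting this, the same POST call could be sketched in Python with only the standard library. This mirrors the configuration above (project and format in the query string, engine in the x-www-form-urlencoded body). The function names and the BASE_URL constant are my own, and the format names are expected to change in the next release, so treat it as an illustration rather than a stable recipe.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:3333"  # server address used in this thread


def build_export_request(project_id: str, export_format: str) -> urllib.request.Request:
    """Build the export-rows POST request without sending it."""
    # project and format go in the query string, as in the call above.
    query = urllib.parse.urlencode({"project": project_id, "format": export_format})
    # engine goes in the form-encoded body as a JSON string.
    engine = json.dumps({"facets": [], "mode": "record-based"}, separators=(",", ":"))
    body = urllib.parse.urlencode({"engine": engine}).encode("ascii")
    return urllib.request.Request(
        f"{BASE_URL}/command/core/export-rows?{query}",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def export_to_file(project_id: str, export_format: str, out_path: str) -> None:
    """Send the request and write the response body to out_path."""
    request = build_export_request(project_id, export_format)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        out.write(response.read())
```

A call such as export_to_file("2620164079995", "Turtle (Pretty)", "myproject.ttl") would then correspond to the Postman request above, assuming the server is running locally.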
One step further towards the batch mode: I am about to create a project from an XML file with the OpenRefine API.
I have the following XML sample data:
<design>
<name>mydesign</name>
<port>
<var>
<name>var1</name>
</var>
<var>
<name>var2</name>
</var>
</port>
<port>
<var>
<name>var3</name>
</var>
</port>
</design>
The API call I used is:
# POST call
http://localhost:3333/command/core/create-project-from-upload
# body, form-data
project-name: mydesign
project-file: <I PASTED THE XML FILE CONTENT HERE>
format: text/xml
options: {"recordPath": ["design"], "trimStrings": true, "storeEmptyStrings": false}
The call was executed successfully, but the created project does not contain any data. I suppose that I did not set the form-data correctly. Could you please take a look? Thanks.
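One thing that may be worth checking is whether project-file is sent as a real file part (with a filename) rather than as pasted text. That is only a guess at the cause and is not confirmed in this thread. A minimal Python sketch of such a multipart/form-data body, using only the standard library, could look like this. The function name build_upload_body is made up, and depending on the OpenRefine version the server may additionally expect a CSRF token or the options in the query string, which this sketch does not cover.

```python
import json
import uuid
from typing import Tuple


def build_upload_body(
    filename: str, file_bytes: bytes, project_name: str, options: dict
) -> Tuple[bytes, str]:
    """Return (body, Content-Type header value) for a multipart upload."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields, using the same names as in the call above.
    for name, value in (
        ("project-name", project_name),
        ("format", "text/xml"),
        ("options", json.dumps(options)),
    ):
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The file part carries a filename, which marks it as an uploaded file
    # instead of an ordinary text field.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="project-file"; filename="{filename}"\r\n'
            "Content-Type: text/xml\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    # Closing boundary.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned body and header value can then be posted to command/core/create-project-from-upload with any HTTP client.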
An update: I have been able to successfully create my XML project with this OpenRefine Client. Thanks, guys, for your help!
That's fantastic.
I'll change the formats for the next release, so that WILL affect any command/core/export-rows calls. I'll add a wiki page for command line / batch processing.
Also, the next release requires OpenRefine 3.6 or better, as that version supports the updated Jena library. The release is not backward compatible, because the RDFProto format was not properly supported in the prior Jena library and I need to register all export types before RDF Transform is successfully loaded. I wish there were a way to detect which formats are available before registering them, but I currently don't have a way to do that.
I found this extension very useful. We have a very large set of XML files that share the same structure, so one mapping file can apply to them all. However, it would be tedious to manually load them one by one into OpenRefine. I'd like to ask whether there is a batch mode that allows us to run the transformation from the command line. Thanks.