Closed axfelix closed 4 years ago
Thanks for your question, @axfelix. The answer really is not obvious.
OpenRefine does not support this feature, but there were recent discussions in https://github.com/OpenRefine/OpenRefine/issues/715. There is a proposal for implementation in a comment from 2018 and an open question about the use case in a more recent comment. Maybe you can help and describe your use case there?
With openrefine-client you can script a workaround. Here is an example that replaces the existing project.
Download example data and create the project myproject:
[felix@tux Desktop]$ openrefine-client --download "https://git.io/fj5hF" --output original.csv
Download to file original.csv complete
[felix@tux Desktop]$ openrefine-client --download "https://git.io/fj5hF" --output new.csv
Download to file new.csv complete
[felix@tux Desktop]$ openrefine-client --create original.csv --projectName myproject
id: 1733286564342
rows: 10
Append the rows from new.csv to the project myproject:
[felix@tux Desktop]$ openrefine-client --export myproject --output old.csv
Export to file old.csv complete
[felix@tux Desktop]$ openrefine-client --delete myproject
Project 1733286564342 has been successfully deleted
[felix@tux Desktop]$ zip combined.zip old.csv new.csv
adding: old.csv (deflated 52%)
adding: new.csv (deflated 52%)
[felix@tux Desktop]$ openrefine-client --create combined.zip --format csv --projectName myproject
id: 2231810029129
rows: 20
Note that the project id will change. I don't know a way to set the id manually.
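The export, zip, delete, and re-create steps above could be bundled into one shell function. This is only a minimal sketch: the function name `append_rows` and the `CLIENT` variable are hypothetical names introduced here, and it uses only the openrefine-client flags shown in the session above. As with the manual steps, the project is deleted and re-created, so its id changes.

```shell
#!/usr/bin/env bash
# Sketch: "append" a CSV to an OpenRefine project by exporting the project,
# deleting it, and re-creating it from a zip of the old and new rows.
set -euo pipefail

# Hypothetical override so a different client command can be substituted.
CLIENT="${CLIENT:-openrefine-client}"

# append_rows PROJECT_NAME NEW_CSV (hypothetical helper, not part of the client)
append_rows() {
  local project="$1" new_csv="$2"
  local workdir
  workdir="$(mktemp -d)"

  # Export the current rows of the project.
  "$CLIENT" --export "$project" --output "$workdir/old.csv"

  # Bundle old and new rows into a single archive.
  cp "$new_csv" "$workdir/new.csv"
  (cd "$workdir" && zip -q combined.zip old.csv new.csv)

  # Replace the project. If a step fails, the script stops before cleanup,
  # so the exported old.csv is left in the temporary directory.
  "$CLIENT" --delete "$project"
  "$CLIENT" --create "$workdir/combined.zip" --format csv --projectName "$project"

  rm -rf "$workdir"
}
```

Assuming the function has been loaded into the current shell, it could be called as `append_rows myproject new.csv`.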
If you want to distinguish between old and new data, you can add the flag --includeFileSources:
[felix@tux Desktop]$ openrefine-client --create combined.zip --format csv --projectName myproject --includeFileSources true
id: 1615195201038
rows: 20
Thank you for the detailed response! I'll close this issue and comment upstream, but I wonder if this could be added to the readme as well, as it's quite useful.
Thanks for your suggestion! I have added a chapter to the README: https://github.com/opencultureconsulting/openrefine-client#append-data-to-an-existing-project
Sorry if this is an obvious question, but I can't quite see it from the examples. This works very well for exporting CSV from an existing project, but I can't see whether the reverse is possible, e.g., adding additional rows to an existing project via CLI. I've reviewed https://github.com/opencultureconsulting/openrefine-batch and I can't quite see it there, either...
Alternatively, if this goes against OpenRefine's data model, could a CSV be loaded automatically into a new project following a template and then merged into the existing project?