VisSieve / main

https://vissieve.github.io/main/documentation/site

export labelstudio information #11

Closed DevinBayly closed 1 week ago

DevinBayly commented 2 weeks ago

The process involves hitting Export on the Label Studio instance and then selecting JSON.

Image

Then we use

Image

from DuckDB https://duckdb.org/docs/guides/file_formats/json_import.html to get the data into our publications.db.
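For readers without the screenshot, a minimal sketch of what such an import could look like with DuckDB's Python client. This is not necessarily the exact command in the image: the table name `labelstudio_export` and the file name `export.json` are hypothetical, and `read_json_auto` is the table function described in the linked DuckDB guide.

```python
def build_import_sql(json_path: str, table: str = "labelstudio_export") -> str:
    """Build the statement that loads a JSON file into a DuckDB table.

    `table` is a hypothetical name used only for illustration.
    """
    escaped = json_path.replace("'", "''")  # escape single quotes in the path
    return f"CREATE OR REPLACE TABLE {table} AS SELECT * FROM read_json_auto('{escaped}')"


def import_label_studio_export(db_path: str, json_path: str,
                               table: str = "labelstudio_export") -> int:
    """Load the export into the database file and return the imported row count."""
    import duckdb  # third-party: pip install duckdb

    con = duckdb.connect(db_path)
    try:
        con.execute(build_import_sql(json_path, table))
        return con.execute(f"SELECT count(*) FROM {table}").fetchone()[0]
    finally:
        con.close()
```

Usage would be something like `import_label_studio_export("publications.db", "export.json")`, assuming the export was saved next to the database.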

This will be a much more verbose version of the JSON than we originally planned, but I figure it will give the visualization tool more control over what it displays.

DevinBayly commented 2 weeks ago

I'm planning to add a new table, figure_annotation_json.

But this will also require updating the IDs to match the figure table, which will hold the CyVerse URLs.
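A rough sketch of how the ID matching might work, assuming each exported Label Studio task carries an image path under `data` and that the file name survives unchanged in the CyVerse URL. Both are assumptions: the `image` key depends on the labeling config, Label Studio may rename uploaded files, and the `(id, url)` shape of the figure rows is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def file_name(url_or_path: str) -> str:
    """Return the last path component of a URL or path."""
    return PurePosixPath(urlparse(url_or_path).path).name


def match_tasks_to_figures(tasks, figures, image_key="image"):
    """Map Label Studio task ids to figure ids by shared file name.

    tasks:   list of exported task dicts with "id" and "data" keys
    figures: iterable of (figure_id, url) pairs (hypothetical row shape)
    Returns (matched, unmatched): a dict of task id -> figure id, and a
    list of task ids that had no figure with the same file name.
    """
    by_name = {file_name(url): fig_id for fig_id, url in figures}
    matched, unmatched = {}, []
    for task in tasks:
        name = file_name(task["data"][image_key])
        if name in by_name:
            matched[task["id"]] = by_name[name]
        else:
            unmatched.append(task["id"])
    return matched, unmatched
```

Anything that lands in `unmatched` would need manual review before the annotation rows can point at the figure table.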

DevinBayly commented 2 weeks ago

Actually, it seems there's a bit of a problem: the Label Studio information is stored under author names, not according to the structure in this diagram and the helpful text Ben left us.

Image

Image

DevinBayly commented 2 weeks ago

The next step is to see whether Ben's upload is still around somewhere.

Otherwise we can release the database without the connection back from the snapshots to the paper information.

I also think this might be a case where I can write something that redoes some of the figure retrieval and then compares the results to the data that is already annotated.
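One way that comparison could be sketched, assuming the re-retrieved figures come out byte-identical to the images that were annotated (if the extraction changes the bytes, something like perceptual hashing would be needed instead). The function name and the name-to-bytes dictionaries are hypothetical.

```python
import hashlib


def link_by_content(new_figures: dict, annotated: dict):
    """Link annotated images to freshly retrieved figures by SHA-256 digest.

    Both arguments map a name (path, URL, or id) to the raw image bytes.
    Returns (links, orphans): a dict of annotated name -> new figure name,
    and a list of annotated names with no identical new figure.
    """
    digest_to_new = {
        hashlib.sha256(data).hexdigest(): name
        for name, data in new_figures.items()
    }
    links, orphans = {}, []
    for name, data in annotated.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in digest_to_new:
            links[name] = digest_to_new[digest]
        else:
            orphans.append(name)
    return links, orphans
```

The `links` result would give the connection from each annotated snapshot back to a figure whose paper is known; the `orphans` list shows how much of the annotated set could not be recovered this way.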

DevinBayly commented 1 week ago

Will return to this when we have another set of annotations available. Right now we are overhauling the actual PDF figure gathering, so the export isn't quite ready for the new data.