DevinBayly closed this 1 week ago
I'm planning to add the new table figure_annotation_json.
This will also require updating the IDs to match the figure table, which will hold the CyVerse URLs.
Actually, there seems to be a bit of a problem: the Label Studio information is stored under author names, not according to the structure in this diagram and the helpful text Ben left us.
Next steps are to see if Ben's upload is somewhere around.
Otherwise we can release the database without the connection back from the snapshots to the paper information.
I also think this might be a case where I can write something that redoes some of the figure retrieval and then compares the result to the data that is already annotated.
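A rough sketch of what that comparison could look like, assuming the Label Studio JSON export is a list of tasks that each carry a `data` dict, and assuming the figure reference lives under an `image` key (that key name is a guess, as are the function names, and it depends on how the labeling project was configured):

```python
import json


def annotated_images(export_path, image_key="image"):
    """Collect the figure references found in a Label Studio JSON export.

    image_key is an assumption; use whatever key the project stores figures under.
    """
    with open(export_path, encoding="utf-8") as fh:
        tasks = json.load(fh)
    # Skip tasks that have no data dict or no figure reference.
    return {
        task["data"][image_key]
        for task in tasks
        if image_key in task.get("data", {})
    }


def compare_figures(retrieved, annotated):
    """Split figures into matched, not yet annotated, and annotated but not retrieved."""
    retrieved, annotated = set(retrieved), set(annotated)
    return {
        "matched": sorted(retrieved & annotated),
        "missing_annotation": sorted(retrieved - annotated),
        "orphaned_annotation": sorted(annotated - retrieved),
    }
```

The `orphaned_annotation` bucket would be the interesting one here, since those are annotations we can't yet connect back to the paper information.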
Will return to this when we have another set of the annotations available. Right now we are overhauling the actual PDF figure gathering, so the export isn't quite ready for the new data.
The process involves hitting Export on the Label Studio instance and then selecting JSON.
Then we use the JSON import from DuckDB (https://duckdb.org/docs/guides/file_formats/json_import.html) to get the data into our publications.db.
This will be a much more verbose version of the JSON than we originally planned, but I figure it'll give the visualization tool more control over what it displays.
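For reference, a minimal sketch of the import step, assuming the `duckdb` Python package is installed and the export was saved as `export.json` (the file name and both function names are placeholders, not what we actually use). It relies on DuckDB's `read_json_auto` to infer the schema, which is why the resulting table ends up as verbose as the export itself:

```python
def build_import_sql(json_path, table="figure_annotation_json"):
    """Build the statement that loads a JSON export into a DuckDB table."""
    # Escape single quotes so the path is a valid SQL string literal.
    escaped = json_path.replace("'", "''")
    return (
        f"CREATE OR REPLACE TABLE {table} AS "
        f"SELECT * FROM read_json_auto('{escaped}')"
    )


def import_export(db_path, json_path, table="figure_annotation_json"):
    """Load the export into the database and return the number of rows imported."""
    import duckdb  # third-party dependency: pip install duckdb

    con = duckdb.connect(db_path)
    try:
        con.execute(build_import_sql(json_path, table))
        return con.execute(f"SELECT count(*) FROM {table}").fetchone()[0]
    finally:
        con.close()
```

Calling `import_export("publications.db", "export.json")` would replace any existing figure_annotation_json table, so this is only safe as long as the table is treated as a straight mirror of the latest export.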