Open valentinedwv opened 10 months ago
@valentinedwv agreed!! I can share the following related work from OIH.
We tried to bring this together in a dashboard at http://dashboard.oceaninfohub.org/, which Jeff in the OIH project put together, so I don't know much about it.
However, the performance is not great (pathetic, to be honest). So now I run all the SPARQL queries ahead of time and write the results into a Parquet file for him, and he is transitioning to using DuckDB to access that file, which is almost instantaneous.
I use https://github.com/iodepo/odis-arch/blob/master/graphOps/releaseGraphs/OIH_GraphPreProc.ipynb to process the release graphs for OIH into Parquet and RDF (NQ) files.
Info about that is at https://github.com/iodepo/odis-arch/tree/master/graphOps/releaseGraphs
I use https://github.com/iodepo/odis-arch/blob/master/graphOps/releaseGraphs/extraction_OIHDashboard.ipynb to test querying those files with DuckDB.
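For anyone following along, querying a preprocessed Parquet file from DuckDB looks roughly like the sketch below. This is not the code from the notebooks above. The helper names are made up for illustration, and it only counts rows so that it assumes nothing about the actual column names in the OIH files.

```python
def sql_string_literal(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def row_count_sql(parquet_path: str) -> str:
    """Build a query that counts the rows of a Parquet file via read_parquet()."""
    return f"SELECT COUNT(*) FROM read_parquet({sql_string_literal(parquet_path)})"


def count_rows(parquet_path: str) -> int:
    """Run the row-count query on an in-memory DuckDB connection."""
    import duckdb  # third-party package: pip install duckdb

    con = duckdb.connect()  # in-memory database; the Parquet file is read in place
    try:
        return con.execute(row_count_sql(parquet_path)).fetchone()[0]
    finally:
        con.close()


# Example call (hypothetical file name, not the actual release artifact):
# count_rows("oih_release.parquet")
```

Since DuckDB reads the Parquet file directly, there is no triplestore round trip at query time, which is presumably where the speedup comes from.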
This is all still very much a work in progress, so it is all test code. I also want to get a script to publish these files to Zenodo and get a DOI for the release graphs at major milestones.
Would love to discuss and plan a set of products for DeCODER similar in approach to what I have started there for OIH. I've not resolved a clear path for them yet either.
digging... Dashboard here.
Looks like it's dynamic, and runs the queries. Present plan is to just run reports weekly, or when a data load is complete, so this backs that idea up. We'd just need to rework the UI to use non-dynamic sources.
Writing results to parquet, ok...
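The "weekly, or when a data load is complete" rule could be sketched as a small check like the one below. The function name and the idea of tracking two timestamps are assumptions for illustration, not anything that exists in the utilities yet.

```python
from datetime import datetime, timedelta
from typing import Optional


def needs_refresh(
    last_report: Optional[datetime],
    last_load: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(days=7),
) -> bool:
    """Return True when the static report should be regenerated."""
    if last_report is None:
        return True  # no report has been written yet
    if last_load is not None and last_load > last_report:
        return True  # a data load finished after the last report
    return now - last_report >= max_age  # the report is a week old or more
```

A scheduled job could call this before rerunning the queries and rewriting the Parquet output, so the UI only ever reads the static files.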
Looking to get the queries into the utilities, so that they are in one place, not two or three.
One thing that needs to be solved... where do we put all the queries? Thinking we pull them from a GitHub repo via raw URLs so that all projects can share them. Keep moving towards a shared architecture...
A repo would allow organizing them into directories by task and project, and allow for experimental and borked directories.
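A rough sketch of pulling a shared query from a raw GitHub URL is below. The `queries/<project>/<task>/<name>` layout and the owner and repo names in the example are hypothetical, just one way the directories could be organized. Only the `raw.githubusercontent.com/<owner>/<repo>/<branch>/<path>` URL shape is GitHub's own.

```python
from urllib.parse import quote
from urllib.request import urlopen

RAW_BASE = "https://raw.githubusercontent.com"


def raw_query_url(owner: str, repo: str, branch: str,
                  project: str, task: str, name: str) -> str:
    """Build the raw URL for a query stored at queries/<project>/<task>/<name>.

    Assumes a simple branch name without slashes (e.g. "main" or "master").
    """
    parts = (owner, repo, branch, "queries", project, task, name)
    return RAW_BASE + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_query(url: str, timeout: float = 30.0) -> str:
    """Download the query text from the raw URL."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example (hypothetical repo and file names):
# url = raw_query_url("example-org", "shared-queries", "main",
#                     "oih", "dashboard", "counts.rq")
# sparql = fetch_query(url)
```

Experimental or borked queries could live under their own top-level directory in place of the project name, so each project chooses what to pull.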
Think running some of these, perhaps as a monthly task, would be good: https://github.com/earthcubearchitecture-project418/assay-data/blob/master/README.md
Put in an issue for utilities.