RECETOX / recedb


Research whether sqlite is a possible data format for Galaxy data tables/data files #1

Open martenson opened 3 years ago

martenson commented 3 years ago

AFAIK, Galaxy data tables are commonly stored in TSV format.

https://galaxyproject.org/admin/tools/data-tables/

xtracko commented 3 years ago

Usually, I have seen data-tables used only for storing meta-information about some shared "payload" file (for example a FASTA file; in our case, the SQLite file). This meta-information contains a path to the payload file, a human-readable name, the date of creation, etc. Based on this metadata, the user can select the correct input file for their tool.
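To make the idea concrete, here is a minimal sketch of that arrangement: a tab-separated entry that only describes the SQLite payload, and a reader that follows the path to open it. The column names (`value`, `name`, `date`, `path`), the `compounds` table, and all file names are assumptions for illustration; a real Galaxy data table declares its own columns in its configuration, and this is not how Galaxy itself loads the file.

```python
import csv
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path

# Hypothetical column layout for the metadata entry (an assumption, not
# taken from any existing Galaxy data table definition).
COLUMNS = ["value", "name", "date", "path"]


def create_payload(db_path):
    """Create a tiny SQLite payload with a made-up 'compounds' table."""
    with closing(sqlite3.connect(str(db_path))) as conn:
        conn.execute("CREATE TABLE compounds (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany(
            "INSERT INTO compounds (name) VALUES (?)",
            [("caffeine",), ("glucose",)],
        )
        conn.commit()


def append_entry(loc_path, entry):
    """Append one tab-separated metadata row describing a payload file."""
    with open(loc_path, "a", newline="") as fh:
        writer = csv.writer(fh, delimiter="\t", lineterminator="\n")
        writer.writerow([entry[column] for column in COLUMNS])


def read_entries(loc_path):
    """Read metadata rows, skipping blank lines and '#' comment lines."""
    with open(loc_path, newline="") as fh:
        rows = csv.reader(fh, delimiter="\t")
        return [
            dict(zip(COLUMNS, row))
            for row in rows
            if row and not row[0].startswith("#")
        ]


def count_compounds(entry):
    """Open the payload referenced by an entry and count its compounds."""
    with closing(sqlite3.connect(entry["path"])) as conn:
        return conn.execute("SELECT COUNT(*) FROM compounds").fetchone()[0]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "recedb.sqlite"
        loc_path = Path(tmp) / "recedb.loc"
        create_payload(db_path)
        append_entry(loc_path, {
            "value": "recedb_v1",
            "name": "RECETOX compounds (example)",
            "date": "2021-01-01",
            "path": str(db_path),
        })
        for entry in read_entries(loc_path):
            print(entry["name"], count_compounds(entry))
```

The table here holds one row per SQLite file, not one row per compound, so its size does not grow with the contents of the database.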

From your question, I get the impression that you are suggesting that we make a data-table of each chemical compound (the contents of the SQLite files). Is there a reason to make Galaxy aware of each chemical compound? As I understand it, Galaxy also indexes the data-table files, so with a large number of entries this approach might get slow.

martenson commented 3 years ago

Maybe the IUC's data_manager_ncbi_taxonomy_sqlite can serve as an inspiration. https://github.com/galaxyproject/tools-iuc/tree/master/data_managers/data_manager_ncbi_taxonomy_sqlite/data_manager

> make a data-table of each chemical compound

I wasn't trying to suggest that, sorry for not being clear.