globalbioticinteractions / elton

Access, review and index existing species interaction datasets
GNU General Public License v3.0

Providing a custom interaction_types_mapping.csv #61

Open zedomel opened 5 days ago

zedomel commented 5 days ago

Hi @jhpoelen

The idea is to have a way to run elton with a custom interaction_types_mapping.csv file location. The location could be included in globi.json, for example:

awk -F '\t' -v version_anchor="${VERSION_ANCHOR}" '{ print "{ \"namespace\": \"" $1 "\", \"citation\": \"<" $1 "> <http://www.w3.org/ns/prov#wasDerivedFrom> <" version_anchor "> .\", \"format\": \"dwca\", \"url\": \"https://linker.bio/" $1 "\", \"interaction_types_mapping_location\": \"/home/john/interaction_types_mapping.csv\" }" }' 
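For readers who find the escaped awk one-liner hard to parse, here is a minimal Python sketch of the record it would emit for one input row. The `interaction_types_mapping_location` key is the one proposed in this issue, not an existing elton setting, and the namespace and version anchor values below are made-up placeholders.

```python
import json


def globi_entry(namespace, version_anchor, mapping_location):
    """Build one globi.json-style record mirroring the awk one-liner above."""
    return {
        "namespace": namespace,
        "citation": (
            f"<{namespace}> <http://www.w3.org/ns/prov#wasDerivedFrom> "
            f"<{version_anchor}> ."
        ),
        "format": "dwca",
        "url": f"https://linker.bio/{namespace}",
        # Proposed key from this issue; an assumption, not a current elton field.
        "interaction_types_mapping_location": mapping_location,
    }


# Placeholder values, for illustration only.
entry = globi_entry(
    "example-namespace",
    "example-version-anchor",
    "/home/john/interaction_types_mapping.csv",
)
print(json.dumps(entry, indent=2))
```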

Or as an elton parameter: elton stream --interaction-types-mapping=/home/john/interaction_types_mapping.csv

I prefer the first option, since it records the use of a custom interaction types mapping in the metadata (globi.json).

What do you think?

best. josé.

jhpoelen commented 4 days ago

Thanks for your suggestion! Perhaps we can support both globi.json and the command-line option, so that you can run elton with or without globi.json.
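If both sources are supported, some precedence rule is needed when they disagree. A minimal sketch of one possible rule, assuming the command-line value wins over globi.json and that a missing value means "use the built-in mapping" (both are assumptions for discussion, not how elton currently behaves; the function name is hypothetical):

```python
# Hypothetical sentinel meaning "fall back to elton's built-in mapping".
BUILT_IN_MAPPING = None


def resolve_mapping_location(cli_value, globi_config):
    """Pick the mapping location: CLI option first, then globi.json, then built-in."""
    if cli_value:
        return cli_value
    # Key name as proposed in this issue; an assumption, not an existing field.
    return globi_config.get("interaction_types_mapping_location", BUILT_IN_MAPPING)


config = {"interaction_types_mapping_location": "/home/john/interaction_types_mapping.csv"}
print(resolve_mapping_location(None, config))
print(resolve_mapping_location("/tmp/other_mapping.csv", config))
print(resolve_mapping_location(None, {}))
```

Letting the command-line option win keeps ad hoc runs easy, while the globi.json entry remains the recorded default for the dataset.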

Also, I'd be curious to hear your ideas on how to combine the Preston idea of recording associations with locations (e.g., /home/john/interaction_types_mapping.csv) and their content signatures (e.g., hash://sha256/abc...).
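As a starting point for that discussion, here is a rough sketch of pairing a location with its content signature in the hash://sha256/... form mentioned above. The record layout and function names are hypothetical and are not Preston's actual format or API.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a hash://sha256/... identifier for the given bytes."""
    return f"hash://sha256/{hashlib.sha256(data).hexdigest()}"


def record_association(location: str) -> dict:
    """Read the file at `location` and pair the location with its content id."""
    with open(location, "rb") as handle:
        data = handle.read()
    # Hypothetical record shape: where the mapping was found and what it contained.
    return {"location": location, "content_id": content_id(data)}


# The same bytes always yield the same identifier, regardless of location.
print(content_id(b"example mapping content"))
```

With a record like this, globi.json could keep the human-friendly location while the content id pins down exactly which version of the mapping was used.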