pombase / pombase-chado

PomBase code for accessing Chado
MIT License

Training datasets for ML/AI - publication centric #1181

Open ValWood opened 1 week ago

ValWood commented 1 week ago

Create a "publication centric" file containing all entities / annotations (all datatypes) for each publication.

JSON?

kimrutherford commented 1 week ago

JSON makes sense. How urgent is this?

ValWood commented 1 week ago

It would be good to have it in a few weeks, I think, to keep the ball rolling. I'm meeting the ePMC ML person on Monday if you want to join (I've forwarded the invite). v

kimrutherford commented 1 week ago

I'll start this on Monday. It might take a couple of days because the existing code needs improving first. A lot of it was written in a hurry for PomBase v2. Now that I've had time (7 years?) to think about it, there are better ways to do things.

Proposed JSON structure (work in progress):

PMID:

kimrutherford commented 3 days ago

From Zoom: make sure to include annotation comments in the output.
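
To make the "publication centric" idea concrete, here is a minimal Python sketch of grouping flat annotation records by publication and keeping the comment on each one. It is not the proposed structure from this thread and not PomBase code: the field names (`pmid`, `type`, `gene`, `term`, `comment`), the function `group_by_publication`, and the records themselves are all hypothetical placeholders.

```python
import json
from collections import defaultdict


def group_by_publication(annotations):
    """Group flat annotation records by publication ID, then by annotation type.

    All field names here are hypothetical; they are not the PomBase schema.
    """
    by_pub = defaultdict(lambda: defaultdict(list))
    for ann in annotations:
        # Copy every field except the two grouping keys, so that the
        # annotation comment (if any) is carried into the output.
        record = {k: v for k, v in ann.items() if k not in ("pmid", "type")}
        by_pub[ann["pmid"]][ann["type"]].append(record)
    # Convert nested defaultdicts to plain dicts for JSON serialisation.
    return {pmid: dict(types) for pmid, types in by_pub.items()}


# Placeholder records for illustration only, not real PomBase data.
example = [
    {"pmid": "PMID:0000001", "type": "go", "gene": "gene-a",
     "term": "TERM:1", "comment": "example comment"},
    {"pmid": "PMID:0000001", "type": "phenotype", "gene": "gene-a",
     "term": "TERM:2", "comment": None},
    {"pmid": "PMID:0000002", "type": "go", "gene": "gene-b",
     "term": "TERM:3", "comment": None},
]

grouped = group_by_publication(example)
print(json.dumps(grouped, indent=2))
```

Keying the top level by publication ID means a consumer can pull everything curated from one paper with a single lookup, and nesting by datatype keeps "all datatypes" in one file without mixing record shapes.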

ValWood commented 2 days ago

Related: https://github.com/pombase/pombase-chado/issues/1185. We'll discuss this on the next call.