cov-lineages / pango-designation

Repository for suggesting new lineages that should be added to the current scheme

ENH: Adding a structured lineage <=> issue mapping file #2109

Open corneliusroemer opened 1 year ago

corneliusroemer commented 1 year ago

Right now, we mention lineage-related issues in lineage_notes.txt, which is a semi-structured TSV file.

It would be less bug-prone and easier for machines to parse if we added the lineage-to-issue information in a separate lineage_issue.csv file.

AngieHinrichs commented 1 year ago

It should have either

or

AngieHinrichs commented 1 year ago

@ciscorucinski are you already making a file like this?

ciscorucinski commented 1 year ago

Not at the moment. I used a regex within Google Sheets to pull that information from each row. I haven't expanded on it, as the GitHub issues aren't structured for data processing. But here is a simpler version of the regex that seems to work: https://regex101.com/r/Qfjcil/1

Also note that EG.1 references https://github.com/jmcbroome/auto-pango-designation in addition to having an issue attached to it. No other designation seems to have multiple IDs attached to it.
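For illustration, the same kind of extraction could be sketched in Python rather than Google Sheets. This is only a rough sketch under several assumptions: the pattern below is not the regex from the regex101 link, it assumes descriptions mention issues either as "issue #N" or as "owner/repo#N", it assumes a bare "issue #N" refers to cov-lineages/pango-designation, and the output columns (lineage, repo, issue) are hypothetical, not an agreed layout for lineage_issue.csv.

```python
import csv
import io
import re

# Assumed pattern (hypothetical, not the regex101 one): matches either
# "owner/repo#123" or "issue #123" inside a description.
ISSUE_RE = re.compile(
    r"(?:(?P<repo>[\w.-]+/[\w.-]+)\s*#|issue\s+#)(?P<num>\d+)",
    re.IGNORECASE,
)

# Assumption: a bare "issue #N" refers to this repository.
DEFAULT_REPO = "cov-lineages/pango-designation"


def extract_issue_rows(notes_text):
    """Yield (lineage, repo, issue_number) for every issue mention found."""
    reader = csv.reader(
        io.StringIO(notes_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        # Skip the header row and rows without a description column.
        if len(row) < 2 or row[0] == "Lineage":
            continue
        lineage, description = row[0], row[1]
        for match in ISSUE_RE.finditer(description):
            repo = match.group("repo") or DEFAULT_REPO
            yield lineage, repo, int(match.group("num"))


def write_lineage_issue_csv(notes_text, out):
    """Write the extracted rows as CSV with a hypothetical header."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["lineage", "repo", "issue"])
    writer.writerows(extract_issue_rows(notes_text))


if __name__ == "__main__":
    # Made-up rows for demonstration only; not real designations.
    sample = (
        "Lineage\tDescription\n"
        "XX.1\tHypothetical lineage, from issue #1234\n"
        "XX.2\tHypothetical lineage, from example-org/example-repo#56\n"
        "XX.3\tHypothetical lineage with no issue reference\n"
    )
    buffer = io.StringIO()
    write_lineage_issue_csv(sample, buffer)
    print(buffer.getvalue(), end="")
```

A lineage with several references produces one row per reference, so a case like EG.1 would not need special handling. Rows without any recognizable reference are simply skipped.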

ciscorucinski commented 1 year ago

Here is a very quick Google Sheet that splits them by repository (except auto-pango-designation):

https://docs.google.com/spreadsheets/d/1NqUjXOn_JMCdZBHSzPMmxtvu8xcbDZJBEMm3dEVr5TY/edit?usp=sharing

AngieHinrichs commented 1 year ago

Awesome, thanks @ciscorucinski!

DailyCovidCases commented 5 months ago

Please close this issue