airr-community / airr-standards

AIRR Community Data Standards
https://docs.airr-community.org
Creative Commons Attribution 4.0 International

Germline labels #568

Closed williamdlees closed 2 years ago

williamdlees commented 2 years ago

closes #571

Add a description of the file structure to the Germline Sets section, and fix minor typos. Add GermlineSet and GenotypeSet to the list of high-level schema objects.

schristley commented 2 years ago

The file structure will need to match the DataFile which also has a direct correspondence to how data is returned from the ADC API.

williamdlees commented 2 years ago

Thanks Scott. I have added a reference to the AIRR DataFile.

williamdlees commented 2 years ago

Done, but can we please discuss their inclusion in the next major release?

bussec commented 2 years ago

@williamdlees There is no problem with including the Germline object in the next release of the Schema. But if we consider this to be required metadata for all studies, this would be an incompatible change to the MiAIRR standard and thus would have to wait until v2.0. Does this clarify my previous point?

schristley commented 2 years ago

MiAIRR does require a germline set reference: germline_database in set 5 (DataProcessing). It might be reasonable to now require that field to contain the globally unique germline reference ID, that is, germline_set_ref, instead of free-form text?
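
As a rough illustration of what such a requirement could look like in practice, here is a minimal sketch of a check that flags free-form values in germline_database. The three-part, colon-separated form (repository:label:version) and the helper names `looks_like_germline_set_ref` and `check_data_processing` are assumptions made for this example; they are not taken from this thread or from any AIRR tooling.

```python
import re
from typing import List

# Assumed form of a structured reference: three non-empty parts separated
# by colons (repository:label:version), with no whitespace inside a part.
# This pattern is hypothetical and only serves the illustration.
_REF_PATTERN = re.compile(r"^[^:\s]+:[^:\s]+:[^:\s]+$")


def looks_like_germline_set_ref(value: str) -> bool:
    """Return True if value matches the assumed structured reference form."""
    return bool(_REF_PATTERN.match(value.strip()))


def check_data_processing(record: dict) -> List[str]:
    """Collect problems with the germline_database field of one record."""
    problems = []
    value = record.get("germline_database")
    if not value:
        problems.append("germline_database is missing or empty")
    elif not looks_like_germline_set_ref(value):
        problems.append("germline_database is free-form text: %r" % value)
    return problems
```

A record whose germline_database holds a value such as "OGRDB:Human_IGH:2021.11" would pass this check, while a free-text description would be reported.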

bussec commented 2 years ago

@schristley I agree, we should move in this direction. But IMO that's another pull request :smiley:

schristley commented 2 years ago

Added some more detail about the file structure. I think this might be ready to merge.