airr-community / airr-standards

AIRR Community Data Standards
https://docs.airr-community.org
Creative Commons Attribution 4.0 International

How to represent pcr_target_locus for paired chain data #645

Closed kira-neller closed 5 months ago

kira-neller commented 1 year ago

With v4.0 of the iReceptor Gateway, we've added the ability to query Cells in the ADC. With this comes the question of how to represent pcr_target_locus for paired chains from cells (our example 10x study has BCR, TCR, and GEX data).

The issue is that pcr_target_locus is currently restricted to a controlled vocabulary: IGH, IGI, IGK, IGL, TRA, TRB, TRD, TRG. Additionally, pcr_target_locus is in the Repertoire metadata schema, not the Cell schema, so we're not sure how to link them.

For now, we've split B cells and T cells into separate SampleProcessing objects but left pcr_target_locus as null. Any suggestions on how to resolve this?

Thanks! Kira

bussec commented 1 year ago

As far as IG/TR loci are concerned, NucleicAcidProcessing.pcr_target is an array of PCRTarget objects. Therefore a single NucleicAcidProcessing record can describe multiple IG/TR target loci. Note that this is an experimental annotation, so it should describe what the researcher attempted to amplify, not whether this was successful. Therefore, assuming that 10X's mix contains primers for all loci, all of them should be included (also see our basic recommendations on 10X annotation).
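To make the array structure concrete, here is a minimal sketch in Python. It assumes only what is stated in this thread: that pcr_target is an array, that each entry carries a pcr_target_locus, and that the locus must come from the controlled vocabulary listed above. The helper name `make_pcr_targets`, the identifier `bcr_library`, and the choice of loci are hypothetical; which loci a given kit actually targets has to be checked against the kit documentation.

```python
# Controlled vocabulary for pcr_target_locus, as listed earlier in this thread.
ALLOWED_LOCI = {"IGH", "IGI", "IGK", "IGL", "TRA", "TRB", "TRD", "TRG"}


def make_pcr_targets(loci):
    """Build a pcr_target array with one PCRTarget-style dict per targeted locus.

    Hypothetical helper for illustration; it is not part of any AIRR library.
    """
    unknown = sorted(set(loci) - ALLOWED_LOCI)
    if unknown:
        raise ValueError(f"Not in the controlled vocabulary: {unknown}")
    # Only pcr_target_locus is filled in; any other PCRTarget fields are omitted.
    return [{"pcr_target_locus": locus} for locus in loci]


# Assumed example: a B cell library where the heavy chain and both light chains
# were targeted. The list records what was attempted, not what was recovered.
bcr_processing = {
    "sample_processing_id": "bcr_library",  # hypothetical identifier
    "pcr_target": make_pcr_targets(["IGH", "IGK", "IGL"]),
}
```

Because the annotation describes the experimental intent, the list stays the same even if no sequences for one of the loci show up in the data.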

This, however, does not address the relation between the objects. The Cell object does not currently contain any experimental annotation, as this was always assumed to come from the Repertoire (hence the repertoire_id property). If this leads to any ambiguity that we missed, we need to discuss how to fix it (but it won't be straightforward).
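As a rough illustration of that lookup, the sketch below follows a Cell's repertoire_id to its Repertoire and collects the targeted loci from there. It assumes the Repertoire holds its sample processing records in a `sample` array and that each record carries its own `pcr_target` array; the function name `targeted_loci_for_cell` and all identifiers in the example data are hypothetical.

```python
def targeted_loci_for_cell(cell, repertoires):
    """Return the sorted loci targeted in the Repertoire that a Cell points to.

    `repertoires` maps repertoire_id to a Repertoire-style dict. The layout
    (a "sample" list whose entries hold a "pcr_target" list) is an assumption.
    """
    repertoire = repertoires[cell["repertoire_id"]]
    loci = set()
    for sample in repertoire.get("sample", []):
        # pcr_target may be missing or null, so fall back to an empty list.
        for target in sample.get("pcr_target") or []:
            locus = target.get("pcr_target_locus")
            if locus is not None:
                loci.add(locus)
    return sorted(loci)


# Hypothetical data: one Repertoire with separate B cell and T cell processing.
repertoires = {
    "rep-1": {
        "repertoire_id": "rep-1",
        "sample": [
            {"pcr_target": [{"pcr_target_locus": "IGH"},
                            {"pcr_target_locus": "IGK"}]},
            {"pcr_target": [{"pcr_target_locus": "TRA"},
                            {"pcr_target_locus": "TRB"}]},
        ],
    }
}
cell = {"cell_id": "cell-1", "repertoire_id": "rep-1"}
loci = targeted_loci_for_cell(cell, repertoires)
```

In this sketch the Cell resolves to every locus targeted anywhere in its Repertoire. It cannot tell which of the two sample processing records the cell came from, which is the kind of ambiguity mentioned above.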

kira-neller commented 1 year ago

Thanks for your comments. I understand what you mean about NucleicAcidProcessing.pcr_target being an experimental annotation: it records what was targeted, whether or not the actual data show that the amplification succeeded.

Regarding the relation between objects, I think any need for this should become clearer as we curate more single-cell studies, and it is something that could be addressed in a v2.0 of the standards.

bcorrie commented 5 months ago

@bussec @javh I am suggesting that we remove this from the v2 milestone (or close it), unless you are aware of any issues. Given the current state of our curation of single-cell studies, I am not seeing any critical issues in this regard; we seem to be able to capture the description of the experiment fine.

javh commented 5 months ago

Seems fine to close.

bussec commented 5 months ago

Fine with me.