Closed bcorrie closed 4 months ago
Following suggestions for discussion:
ReceptorReactivity needs to be associated with both Cells and Receptors. The actual measurement is an observation from an experiment that links a specific Cell to a specific Epitope (at least in the case of a 10X study). That Cell can be associated with a Receptor. In that case, the ReceptorReactivity is also evidence of an association between the more global Receptor and that Epitope, so the ReceptorReactivity should be associated with the Receptor as well.
The current standard provides no way to link a ReceptorReactivity observation with a specific Cell. Since a Receptor might have many ReceptorReactivity values, and they may come from many Cells in many experiments, it is not possible to determine which Cell a ReceptorReactivity came from.
Hence the addition of Cell.reactivity_measurements
@bussec @kira-neller does that capture our discussion?
@bcorrie Yes, this captures the necessary changes as far as I understand. Thank you!
@bcorrie Yes, it captures our discussion.
However, I came across an additional complication that we need to think about: if we refer to ReceptorReactivity records by their ID only (i.e., they are not nested into the Receptor object), then you don't know which receptor the reactivity measurement refers to. From a Cell record you could reconstruct this using both the receptors and the reactivity_measurements properties (which is already a bit of a pain), but for other potential references you would need to search all Receptor records for a matching reactivity measurement ID. Therefore ReceptorReactivity needs to contain the receptor_hash.
Ruminating about this, there is also the situation that we discussed in which a cell expresses more than one receptor. If in such a case you have data from a multimer-MHC binding assay, you won't be able to know which receptor mediated the binding. Therefore the ReceptorReactivity record would need to refer to multiple receptor_hash IDs. That is not a problem by itself, but we need to clearly document that in such a case each of the listed receptors might have been involved, but you cannot be certain which.
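To make the ambiguity concrete, here is a minimal sketch in Python. The dictionary layout, the made-up IDs and hashes, and the helper name `receptor_attribution` are all hypothetical; only the idea that a ReceptorReactivity record may carry one or more receptor_hash values comes from the comment above.

```python
from typing import Any, Dict, List


def receptor_attribution(reactivity: Dict[str, Any]) -> str:
    """Classify how confidently a measurement can be attributed to a receptor.

    Assumes (hypothetically) that the record stores its receptor hashes
    as a list under the key "receptor_hash".
    """
    hashes: List[str] = list(reactivity.get("receptor_hash", []))
    if not hashes:
        return "unlinked"
    if len(hashes) == 1:
        return "certain"
    # More than one receptor on the cell: any of them may have mediated binding.
    return "ambiguous"


# Hypothetical record from a multimer-MHC assay on a cell with two receptors.
measurement = {
    "receptor_reactivity_id": "rr-001",     # made-up ID
    "receptor_hash": ["hash-A", "hash-B"],  # made-up hashes
}
print(receptor_attribution(measurement))
```

A consumer of the data could use such a check to decide whether a measurement counts as direct evidence for a receptor or only as a candidate association.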
> However, I came across an additional complication that we need to think about: if we refer to ReceptorReactivity records by their ID only (i.e., they are not nested into the Receptor object), then you don't know which receptor the reactivity measurement refers to
I am not sure this is true, is it? The Receptor object has a list of ReceptorReactivity IDs in its reactivity_measurements array. So you can find all ReceptorReactivity entities for a Receptor by looking at Receptor.reactivity_measurements, and you can find which Receptor a ReceptorReactivity entity refers to by searching all Receptor objects for the receptor_activity_id in Receptor.reactivity_measurements. So you can find the receptor with a relatively expensive query...
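A rough sketch of that reverse lookup, in Python, shows why it is expensive. The record layout is simplified and hypothetical (only the reactivity_measurements array is taken from the discussion; the IDs are made up), and `receptors_for_measurement` is an illustrative name, not part of any library.

```python
from typing import Any, Dict, List

# Hypothetical, simplified Receptor records.
receptors: List[Dict[str, Any]] = [
    {"receptor_id": "r1", "reactivity_measurements": ["rr-001", "rr-002"]},
    {"receptor_id": "r2", "reactivity_measurements": ["rr-003"]},
]


def receptors_for_measurement(
    measurement_id: str, all_receptors: List[Dict[str, Any]]
) -> List[str]:
    """Return the IDs of every Receptor whose array lists measurement_id.

    This is a full scan: the cost grows with the number of Receptor records.
    """
    return [
        r["receptor_id"]
        for r in all_receptors
        if measurement_id in r.get("reactivity_measurements", [])
    ]


print(receptors_for_measurement("rr-003", receptors))  # ['r2']
```

In a repository this would translate into a query over every Receptor record, which is the cost being pointed out here.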
This is actually fairly cumbersome, and would be more elegant, I think, if the ReceptorReactivity object pointed directly to the Receptor object. We discussed this and for some reason decided that an array of Receptor.reactivity_measurements was better. I can't remember why and I am not sure that was the right choice... 8-)
Isn't it always true that a single ReceptorReactivity instance comes from one, and only one, Cell (the measurement for reactivity comes from a single cell, no?) and in your above scenario might point to a very small number of Receptors (e.g. the Cell expresses more than one Receptor and you don't know which Receptor is causing the reactivity)?
Maybe ReceptorReactivity should have a cell_id field and an array of receptor_id values (where the array would have a small N, 1-2?). We could then get rid of the reactivity_measurement fields, and if you wanted to find all of the ReactivityMeasurement entries associated with a Cell or Receptor, you would search ReceptorReactivity objects for the cell_id or receptor_id of interest.
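As a sketch of how this proposal might look, here is a small Python model. The cell_id and receptor_id fields follow the suggestion above; the `reactivity_id` field, the helper names, and the sample values are hypothetical and only there to make the example runnable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReceptorReactivity:
    reactivity_id: str  # hypothetical identifier for the measurement
    cell_id: str        # the single Cell the measurement came from
    # Expected to hold one or two entries in practice.
    receptor_id: List[str] = field(default_factory=list)


def by_cell(records: List[ReceptorReactivity], cell_id: str) -> List[ReceptorReactivity]:
    """All measurements observed on the given Cell."""
    return [r for r in records if r.cell_id == cell_id]


def by_receptor(records: List[ReceptorReactivity], receptor_id: str) -> List[ReceptorReactivity]:
    """All measurements that list the given Receptor (possibly ambiguously)."""
    return [r for r in records if receptor_id in r.receptor_id]


records = [
    ReceptorReactivity("rr-001", "cell-1", ["r1"]),
    ReceptorReactivity("rr-002", "cell-2", ["r1", "r2"]),  # two candidate receptors
    ReceptorReactivity("rr-003", "cell-3", ["r2"]),
]

print([r.reactivity_id for r in by_cell(records, "cell-2")])  # ['rr-002']
print([r.reactivity_id for r in by_receptor(records, "r2")])  # ['rr-002', 'rr-003']
```

With the links stored on the measurement, both lookups are filters over ReceptorReactivity objects alone, and neither Cell nor Receptor needs a reactivity_measurement array.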
@bussec any comments on this? It would be good to close this off and merge with master.
Adding the other use case as per recent discussions in #705 for ReceptorReactivity
In my single-cell study stored in the ADC, I have found the following:
A Cell with IGHV1-46*01, STVVGAL, IGHJ4*02 and IGKV3-20*01, QQYGSSPLT, IGKJ4*01
This Receptor has a known epitope binding in IEDB: https://www.iedb.org/receptor/193713
A Receptor object can be created to this effect.
So I get something like this:
receptor_ref: ["IEDB_RECEPTOR:193713"]
Since the ReceptorReactivity information is in IEDB already, it may not be necessary to store it in the ADC as a ReceptorReactivity record, since one can get this information from IEDB via the above link. Presumably this is what the AKC project will address and make easy for the user. Currently the user has to jump back and forth between the ADC and IEDB to link this information. The iReceptor Gateway already does this automagically for CDR3 searches if the CDR3 is known in IEDB.
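As an illustration of how the jump from the ADC to IEDB could be automated, the sketch below turns the receptor_ref value shown above into the IEDB link. The URL pattern is inferred only from the link and the receptor_ref entry quoted in this comment, not from any official resolver, and `iedb_receptor_url` is a hypothetical helper name.

```python
def iedb_receptor_url(curie: str) -> str:
    """Map a receptor_ref entry such as "IEDB_RECEPTOR:193713" to an IEDB URL.

    Assumption: the numeric part of the reference is the IEDB receptor ID
    used in https://www.iedb.org/receptor/<id>.
    """
    prefix, _, local_id = curie.partition(":")
    if prefix != "IEDB_RECEPTOR" or not local_id.isdigit():
        raise ValueError(f"not an IEDB receptor reference: {curie!r}")
    return f"https://www.iedb.org/receptor/{local_id}"


receptor_ref = ["IEDB_RECEPTOR:193713"]
print([iedb_receptor_url(ref) for ref in receptor_ref])
```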
If I want to track the known ReceptorReactivity of the Receptor in the ADC, I can create a ReceptorReactivity object with the following fields, pulled from IEDB (https://www.iedb.org/assay/21965299).
This Receptor has 9 assays associated with it that returned activity, so presumably if I wanted to capture all of the reactivity for this receptor, I would have 9 different ReceptorReactivity objects. For example, one of the other assays (https://www.iedb.org/assay/21965295) is an ELISA assay, so I would have the same as above with the exception of:
In trying to load epitope specificity, we have found a few problems. In discussions with @bussec we have come up with a set of suggested changes. Please discuss 8-)