hubmapconsortium / codex-pipeline

CODEX data processing code
GNU General Public License v3.0

Standardize antibody names from `antibodies.tsv` contents #6

Open mkeays opened 4 years ago

mkeays commented 4 years ago

Currently, `convert_to_ometiff.py` adds the antigen names to the `Name` attribute of each `Channel` element. These names are taken from the Cytokit YAML config, which in turn takes them from the submitted channel names array. However, because different antibodies can be used to detect the same antigen, we should also add details of the antibody used to detect each antigen to the OME-XML.

To begin with, this could also go in the `Name` attribute of the `Channel` element, but we need to research whether there is a more appropriate place for it in the OME-XML. The details to add are the antibody name and the Antibody Registry ID.

Bob suggested the following format: `(antibody name)(antibody registry ID)(antigen name)`, e.g. `(rabbit monoclonal anti-human CD8-alpha)(AB_2800052)(CD8e)`
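A minimal sketch of how the suggested three-part name could be built and split apart again. The function names are hypothetical and are not taken from `convert_to_ometiff.py`; the parser also assumes that Antibody Registry IDs always look like `AB_` followed by digits, which should be checked against real data.

```python
import re
from typing import Tuple

# Assumption: registry IDs have the form "AB_<digits>", as in the example above.
CHANNEL_NAME_PATTERN = re.compile(r"^\((.*)\)\((AB_\d+)\)\((.*)\)$")


def format_channel_name(antibody_name: str, registry_id: str, antigen_name: str) -> str:
    """Join the three fields in the parenthesized order suggested above."""
    return f"({antibody_name})({registry_id})({antigen_name})"


def parse_channel_name(channel_name: str) -> Tuple[str, str, str]:
    """Split a combined name back into (antibody name, registry ID, antigen name)."""
    match = CHANNEL_NAME_PATTERN.match(channel_name)
    if match is None:
        raise ValueError(f"Not in the combined format: {channel_name!r}")
    antibody_name, registry_id, antigen_name = match.groups()
    return antibody_name, registry_id, antigen_name


if __name__ == "__main__":
    combined = format_channel_name(
        "rabbit monoclonal anti-human CD8-alpha", "AB_2800052", "CD8e"
    )
    print(combined)
    print(parse_channel_name(combined))
```

Anchoring the parser on the registry ID in the middle keeps it usable when an antibody name itself contains parentheses, but it is one more reason a structured location in the OME-XML may be preferable to packing everything into `Name`.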

mkeays commented 4 years ago

The microscopy DRT is currently working on a tabular file listing details of all antibodies used in all datasets, indexed by dataset UUID. This would mean that, for each dataset, we could pull the exact antibody name and AR ID for each antigen from this file. We're waiting for the file to be ready before we can add this to the code.
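Since the file was not yet available at this point in the thread, the following is only a sketch of the lookup described above. The column names (`dataset_uuid`, `antigen`, `antibody_name`, `registry_id`) and the function name are assumptions made for illustration, not the real schema.

```python
import csv
import io
from typing import Dict, Tuple

AntibodyIndex = Dict[Tuple[str, str], Tuple[str, str]]


def load_antibody_index(tsv_text: str) -> AntibodyIndex:
    """Map (dataset UUID, antigen) to (antibody name, Antibody Registry ID).

    All column names used here are hypothetical placeholders.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    index: AntibodyIndex = {}
    for row in reader:
        key = (row["dataset_uuid"], row["antigen"])
        index[key] = (row["antibody_name"], row["registry_id"])
    return index


if __name__ == "__main__":
    example = (
        "dataset_uuid\tantigen\tantibody_name\tregistry_id\n"
        "example-uuid-1\tCD8e\trabbit monoclonal anti-human CD8-alpha\tAB_2800052\n"
    )
    index = load_antibody_index(example)
    print(index[("example-uuid-1", "CD8e")])
```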

mruffalo commented 2 years ago

The information in antibodies.tsv in newer HuBMAP datasets may be usable for this. We should read this file if present, and either

mruffalo commented 1 year ago

Sean made a lot of progress on this in a branch -- assigning to Penny.

Replacing the channel names is relatively easy; I believe @SFD5311 encountered some issues with adding the previous channel names as structured annotations.
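For reference, a rough sketch of both steps using only the standard library: renaming each `Channel` and recording the previous name in a `MapAnnotation` under `StructuredAnnotations`. This is not the code from the branch mentioned above. The function name, the annotation ID, and the annotation namespace string are hypothetical, and the output has not been validated against the OME schema.

```python
import xml.etree.ElementTree as ET
from typing import Dict

OME_NS = "http://www.openmicroscopy.org/Schemas/OME/2016-06"
# Serialize OME elements without a prefix (note: this is a global setting).
ET.register_namespace("", OME_NS)


def q(tag: str) -> str:
    """Return the namespace-qualified tag name for an OME element."""
    return f"{{{OME_NS}}}{tag}"


def rename_channels(ome_xml: str, new_names: Dict[str, str]) -> str:
    """Rename channels and keep the old names in a MapAnnotation.

    new_names maps an existing Channel Name to its replacement.
    """
    root = ET.fromstring(ome_xml)
    # Collect channels first so the tree is not modified while iterating.
    channels = list(root.iter(q("Channel")))

    annotations = root.find(q("StructuredAnnotations"))
    if annotations is None:
        annotations = ET.SubElement(root, q("StructuredAnnotations"))
    map_annotation = ET.SubElement(
        annotations,
        q("MapAnnotation"),
        {
            "ID": "Annotation:OriginalChannelNames",  # hypothetical ID
            "Namespace": "example.org/original-channel-names",  # hypothetical
        },
    )
    value = ET.SubElement(map_annotation, q("Value"))

    for channel in channels:
        old_name = channel.get("Name")
        if old_name in new_names:
            # Key each entry by Channel ID so it can be matched up later.
            entry = ET.SubElement(value, q("M"), {"K": channel.get("ID", "")})
            entry.text = old_name
            channel.set("Name", new_names[old_name])
    return ET.tostring(root, encoding="unicode")
```

Keying the map entries by `Channel` ID avoids inserting `AnnotationRef` children into each `Channel`, where the schema's element ordering would need more care.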

pennycuda commented 1 year ago

Just sorted the antibodies file by cycle number in `convert_to_ometiff.py`; still working on the rest of this issue.
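One possible shape for such a sort, sketched under assumptions: each row is taken to have a `channel_id` value like `cycle2_ch1`. That column name and format, and the function names, are illustrative guesses and may not match the real `antibodies.tsv` or the actual change. The point of the sketch is that cycle numbers should be compared as integers, so cycle 10 sorts after cycle 2.

```python
import re
from typing import Dict, List, Tuple

# Assumed channel ID format, e.g. "cycle2_ch1"; adjust to the real data.
CHANNEL_ID_PATTERN = re.compile(r"cycle(\d+)_ch(\d+)", re.IGNORECASE)


def cycle_sort_key(row: Dict[str, str]) -> Tuple[int, int]:
    """Return (cycle number, channel number) as integers for sorting."""
    match = CHANNEL_ID_PATTERN.search(row["channel_id"])
    if match is None:
        raise ValueError(f"Unrecognized channel ID: {row['channel_id']!r}")
    return int(match.group(1)), int(match.group(2))


def sort_by_cycle(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return a new list of rows ordered by cycle, then channel."""
    return sorted(rows, key=cycle_sort_key)


if __name__ == "__main__":
    rows = [
        {"channel_id": "cycle10_ch1"},
        {"channel_id": "cycle2_ch2"},
        {"channel_id": "cycle2_ch1"},
    ]
    for row in sort_by_cycle(rows):
        print(row["channel_id"])
```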