immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
https://immunarch.com
Apache License 2.0

Import C genes into immunarch with repLoad #156

Closed. Khang-LQ closed this issue 2 years ago

Khang-LQ commented 3 years ago

🚀 Feature

Immunarch's repLoad function currently creates a data frame with V.name, D.name and J.name, but C.name (the name of the constant region gene) is missing. This is the case both for data from MiXCR's clone tables and from Cell Ranger's filtered_contig_annotations files, even though the C gene information is present in both of them.

Motivation

Having the name of the constant region gene is important when doing analysis related to isotypes and would help visualize isotype distribution in a sample.

Pitch

The repLoad function should be able to import C gene names and display them as a column in the data frame next to the J.name column. [image]

C gene name information is already available in Cell Ranger's output (filtered_contig_annotations): [image]

and in MiXCR's clone tables (highlighted): [image]

vadimnazarov commented 3 years ago

Hi Khang, we are working on it! We need support for C segments for BCR analysis, so it's in progress. @Alexander230, please take a look.

Khang-LQ commented 3 years ago

Thank you very much, I would love to try it out once it is available.

useryanran commented 2 years ago

Dear Vadimnazarov, I am also working on BCR analysis. Is there any progress on adding C genes to the analysis? Thank you very much! Yanran

Alexander230 commented 2 years ago

Hello @Khang-LQ and @useryanran, Thank you for contacting us. My name is Aleksandr Popov, I am a developer of the Immunarch package.

I'm glad to inform you that, starting from version 0.6.7, Immunarch supports importing C genes with repLoad from the MiXCR format. You are welcome to use it!

Best regards, Aleksandr
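
One quick way to confirm that the C gene column was actually imported is to inspect each loaded sample for a C.name column. The sketch below is only an illustration: check_c_genes is a hypothetical helper, not part of Immunarch, and the toy data frames stand in for the list of per-sample data frames that repLoad returns in its data element.

```r
# Hypothetical helper (not an Immunarch function): for every sample, report
# whether a C.name column exists and how many rows carry a non-empty value.
check_c_genes <- function(samples) {
  rows <- lapply(names(samples), function(sample_name) {
    df <- samples[[sample_name]]
    has_col <- "C.name" %in% colnames(df)
    n_filled <- if (has_col) sum(!is.na(df$C.name) & df$C.name != "") else 0L
    data.frame(
      sample = sample_name,
      has_c_name = has_col,
      n_with_c_gene = n_filled,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Toy input with made-up values, shaped like a list of repertoire data frames.
toy <- list(
  s1 = data.frame(
    CDR3.aa = c("CARDYW", "CARGGW"),
    V.name = c("IGHV1-2", "IGHV3-23"),
    J.name = c("IGHJ4", "IGHJ6"),
    C.name = c("IGHM", NA),
    stringsAsFactors = FALSE
  ),
  s2 = data.frame(
    CDR3.aa = "CARSSW",
    V.name = "IGHV4-34",
    J.name = "IGHJ5",
    stringsAsFactors = FALSE
  )
)

c_report <- check_c_genes(toy)
print(c_report)

# With real data (the path is a placeholder):
# library(immunarch)
# packageVersion("immunarch")   # the thread states that 0.6.7 or later is needed
# immdata <- repLoad("path/to/mixcr_clone_tables")
# check_c_genes(immdata$data)
```

If has_c_name is FALSE or n_with_c_gene is 0 for a sample, the C gene column in the input file may be missing, empty, or named differently from what repLoad expects.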

Khang-LQ commented 2 years ago

Hi @Alexander230, I have updated Immunarch to 0.6.7 and tried loading a MiXCR clone table with repLoad. However, I still cannot see the column for C gene call in the Immunarch data frame. Is there a specific command that I need to write to enable this?

Alexander230 commented 2 years ago

Hi, @Khang-LQ, could you please provide an example of the data that you want to load? The C gene column may be named differently, so that it is not detected correctly by repLoad. The example will help me make a fix.

Best regards, Aleksandr

plezar commented 2 years ago

Hi @Alexander230!

I am having the same problem. I have assembled full Ig receptor sequences using the mixcr exportClones command, and the output tables all contain a column named allCAlignments. I have updated Immunarch to the latest version, but the Ig isotype is still not detected. In case you are interested, here is an example of my data.

Alexander230 commented 2 years ago

Hi, @plezar!

Thank you for sending the example data! C gene names are successfully parsed from it into the C.name column, but C alignment coordinates are currently not supported in the release version of Immunarch. Unfortunately, the allCAlignments column is empty in your example data. I've added experimental support for parsing this column, on the assumption that it has the same format as the D gene alignments. You can try the Immunarch version from the experimental branch; to install it, use these commands:

install.packages(c("devtools", "pkgload"))
devtools::install_github("immunomind/immunarch", ref="feature/mixcr-c-alignments")
devtools::reload(pkgload::inst("immunarch"))

Best regards, Aleksandr

Khang-LQ commented 2 years ago

Hi @Alexander230 ,

Thank you for your help, I am now able to import C gene names from MiXCR data. I'm looking forward to importing C gene data from 10X Cell Ranger data as well.

Alexander230 commented 2 years ago

I've planned this feature for one of the next Immunarch releases. It will appear on the dev branch first, and I will notify you when it is ready.

Best regards, Aleksandr

Alexander230 commented 2 years ago

Hi, @Khang-LQ!

I've implemented an experimental branch with support for C genes in the 10x format. You can try it by installing immunarch from the experimental branch with these commands:

install.packages(c("devtools", "pkgload"))
devtools::install_github("immunomind/immunarch", ref="10x-c-gene")
devtools::reload(pkgload::inst("immunarch"))

Once the fix is finalized, it will be moved to the dev branch; after that, to install it, use ref="dev" instead of ref="10x-c-gene" in these commands.

Best regards, Aleksandr

Khang-LQ commented 2 years ago

Hi @Alexander230. I have just tried out the C gene feature for 10X format and it is working very well. Thank you so much for the support.

Alexander230 commented 2 years ago

I'm glad to hear that C gene support is working! I've merged it into the dev branch, and it will be included in Immunarch 0.6.9, which will be released soon. If you have more questions, feel free to reopen the issue and ask them!

Best regards, Aleksandr

aislinnjennings commented 1 year ago

Hi! I was also wondering how to use the C.name column (bulk data) with the repExplore and repDiversity functions, please? I'm hoping to determine clonality and diversity indices according to B cell isotypes in my data. Thanks! Ais
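
The thread does not answer this question. One possible approach, sketched here under assumptions rather than taken from the Immunarch documentation, is to split each sample's data frame by C.name and pass the resulting list of per-isotype data frames to the analysis functions. split_by_c_gene is a hypothetical helper. The sketch assumes each data frame has Clones and Proportion columns, and that C.name holds a single gene name per row; real exports may need the values cleaned first.

```r
# Hypothetical helper (not an Immunarch function): turn a list of samples into
# a list of "<sample>_<C gene>" data frames, one per isotype found in C.name.
split_by_c_gene <- function(samples) {
  out <- list()
  for (sample_name in names(samples)) {
    df <- samples[[sample_name]]
    if (!"C.name" %in% colnames(df)) next          # skip samples without C genes
    keep <- !is.na(df$C.name) & df$C.name != ""    # drop rows with no C gene call
    parts <- split(df[keep, , drop = FALSE], df$C.name[keep])
    for (c_gene in names(parts)) {
      part <- parts[[c_gene]]
      # Recompute proportions so they sum to 1 within each isotype subset.
      part$Proportion <- part$Clones / sum(part$Clones)
      out[[paste(sample_name, c_gene, sep = "_")]] <- part
    }
  }
  out
}

# Toy input with made-up values.
toy <- list(
  s1 = data.frame(
    Clones = c(10, 5, 5),
    Proportion = c(0.5, 0.25, 0.25),
    CDR3.aa = c("CARDYW", "CARGGW", "CARSSW"),
    C.name = c("IGHM", "IGHG1", "IGHM"),
    stringsAsFactors = FALSE
  ),
  s2 = data.frame(
    Clones = 3,
    Proportion = 1,
    CDR3.aa = "CARTTW",
    stringsAsFactors = FALSE
  )
)

by_isotype <- split_by_c_gene(toy)
print(names(by_isotype))

# With real data, the per-isotype list could then be analysed like any other
# list of repertoires (the path is a placeholder):
# library(immunarch)
# immdata <- repLoad("path/to/data")
# by_isotype <- split_by_c_gene(immdata$data)
# repExplore(by_isotype, .method = "clones")
# repDiversity(by_isotype, .method = "chao1")
```

Isotype subsets with very few clonotypes can give unstable diversity estimates, so it may be worth filtering out small subsets before comparing indices.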