immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
https://immunarch.com
Apache License 2.0

Support to repLoad the clone files generated by Trust4 program. #82

Open sciencepeak opened 4 years ago

sciencepeak commented 4 years ago

🚀 Feature

Support to repLoad the clone files generated by Trust4 program.

Motivation

Among clonotype identification tools, Trust4 is an emerging tool that was recently updated. An earlier paper on it was published in Nature Genetics: https://dx.doi.org/10.1038%2Fng.3581

I am not a developer of Trust4, just a user. However, preliminary results show that the tool yields high-quality clones. We want to chain Trust4 and immunarch seamlessly in our research, and we hope to use immunarch to analyze Trust4 results without manually converting them to the VDJtools format.

Pitch

I would like repLoad() to read the clonotype files from TRUST4 directly. If some fields that immunarch needs as input are missing from the Trust4 results, please contact the Trust4 developers to request a new feature: https://github.com/liulab-dfci/TRUST4

Alternatives

Additional context

vadimnazarov commented 4 years ago

Hi @whitehilltea

Thank you for the suggestion! Could you please provide example output from the tool so we can test the immunarch parser on it?

sciencepeak commented 4 years ago

Hi, @vadimnazarov

Thanks for considering the feature request. Attached: trust4_result.zip

Alexander230 commented 2 years ago

Hi, @sciencepeak! My name is Aleksandr Popov, and I am a developer of the Immunarch package. Thank you for using our software!

I'm happy to inform you that, starting from version 0.6.7, Immunarch supports the Trust4 format in the repLoad function. You are welcome to use it!

Good luck, Aleksandr

sciencepeak commented 2 years ago

@Alexander230 Thanks for adding the feature. I tested it and found that the automatic detection of the clonotyping tool labels the Trust4 files as vdjtools. Perhaps there is still room for improvement.

# Replace it with the path to your data. Immunarch automatically detects the file format.
immdata <- repLoad(file.path(getwd(), "trust_clones_result_directory")) 
== Step 1/3: loading repertoire files... ==

Processing "D://trust_clones_result_directory" ...
  -- [1/60] Parsing "D://trust_clones_result_directory/SAMN13223146.trust_clones.txt.bz2" -- vdjtools
  -- [2/60] Parsing "D://trust_clones_result_directory/SAMN13223147.trust_clones.txt.bz2" -- vdjtools
  -- [3/60] Parsing "D://trust_clones_result_directory/SAMN13223148.trust_clones.txt.bz2" -- vdjtools
  -- [4/60] Parsing "D://trust_clones_result_directory/SAMN13223149.trust_clones.txt.bz2" -- vdjtools
  -- [5/60] Parsing "D://trust_clones_result_directory/SAMN13223150.trust_clones.txt.bz2" -- vdjtools
  -- [6/60] Parsing "D://trust_clones_result_directory/SAMN13223151.trust_clones.txt.bz2" -- vdjtools

See the first file as an example: SAMN13223146.trust_clones.txt

Alexander230 commented 2 years ago

Hi, @sciencepeak!

Thank you for providing the data! There may be a bug here. I will investigate it and notify you when a fix is available.

Best regards, Aleksandr