ZhangLabTJU / fastBCR

A heuristic method for fast BCR clonal family inference from large-scale AIRR-seq data

Input format #3

Closed royfrancis closed 3 months ago

royfrancis commented 3 months ago

What data format is used in example/COVID/COVID_1.zip? How do I convert my MiXCR output into a format suitable for fastBCR?

ZhangLabTJU commented 3 months ago

The data in example/COVID/COVID_1.zip follows the Adaptive Immune Receptor Repertoire (AIRR) Rearrangement Schema (https://docs.airr-community.org/en/latest/datarep/rearrangements.html). However, fastBCR does not require your data to be in this format. You only need to find the V gene column (e.g., "Best V gene"), the J gene column (e.g., "Best J gene"), and the junction amino acid column (e.g., "AA SEQ CDR3") in your data and rename them to 'v_call', 'j_call', and 'junction_aa' respectively, so that fastBCR can locate the information it needs. Alternatively, you can use "mixcr exportAirr" in MiXCR to export the output in AIRR format and pass that output to fastBCR directly.
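For anyone who prefers to do the renaming outside of R, here is a minimal sketch in Python using only the standard library. It assumes a tab-separated table whose headers match the examples quoted above; the `rename_columns` helper and the `COLUMN_MAP` dictionary are hypothetical names made up for illustration and are not part of fastBCR or MiXCR. The target name `junction_aa` follows the spelling used by the AIRR Rearrangement Schema. Adjust the source headers to whatever your own export actually contains.

```python
import csv
import io

# Assumed source headers (taken from the examples in the reply above) mapped
# to the AIRR-style names. Edit the keys to match your own table.
COLUMN_MAP = {
    "Best V gene": "v_call",
    "Best J gene": "j_call",
    "AA SEQ CDR3": "junction_aa",
}


def rename_columns(src, dst, column_map=COLUMN_MAP, delimiter="\t"):
    """Copy a delimited table from src to dst, renaming header columns.

    src and dst are open text file objects. Columns not listed in
    column_map are copied through unchanged.
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")

    header = next(reader)
    missing = [name for name in column_map if name not in header]
    if missing:
        # Fail early rather than writing a table that lacks required columns.
        raise KeyError(f"columns not found in input: {missing}")

    writer.writerow([column_map.get(name, name) for name in header])
    writer.writerows(reader)  # data rows are passed through untouched


# Small in-memory demonstration with made-up rows.
example = io.StringIO(
    "Best V gene\tBest J gene\tAA SEQ CDR3\tcount\n"
    "IGHV1-2\tIGHJ4\tCARDYW\t5\n"
)
result = io.StringIO()
rename_columns(example, result)
print(result.getvalue())
```

To apply this to real files, open the input and output with `open(path, newline="")` and pass the two file objects to `rename_columns`; the resulting table can then be read into R for fastBCR.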

royfrancis commented 3 months ago

Thanks! It works.