qmarcou / IGoR

IGoR is a C++ software designed to infer V(D)J recombination related processes from sequencing data. Find full documentation at:
https://qmarcou.github.io/IGoR/
GNU General Public License v3.0

Format of genomic references files #33

Open penuts7644 opened 5 years ago

penuts7644 commented 5 years ago

I have a question regarding the input FASTA files for the genomic references. Are these files supposed to follow a specific format, like IMGT's annotation for example? It looks like IGoR expects the header to be the gene name of the sequences. That would mean that I have to apply some pre-processing to the IMGT reference files before I can use them. Is this correct?

Cheers, Wout

qmarcou commented 5 years ago

Hi Wout, No, there is no constraint on the format, except that gene names must be consistent between the alignment assignments and the gene choice event realizations. It's also worth noting that long names (such as IMGT annotations) will require extra space when writing alignment results, since the gene name is written out as text for each alignment.

Best, Quentin
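If the long IMGT headers become a storage concern, one option is to shorten them before running IGoR. The following is a minimal sketch, not part of IGoR: it assumes IMGT-style headers are `|`-separated with the allele name in the second field, and the function names and example record are hypothetical. Whatever names come out must then be used everywhere (alignments and model) to keep the naming consistent.

```python
def shorten_imgt_header(header: str) -> str:
    """Return the second '|'-separated field of a FASTA header.

    Assumption: IMGT-style headers carry the allele name in that field.
    Headers without '|' are returned unchanged (minus the leading '>').
    """
    fields = header.lstrip(">").strip().split("|")
    return fields[1] if len(fields) > 1 else fields[0]


def shorten_fasta(lines):
    """Yield FASTA lines with shortened headers; sequence lines pass through."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield ">" + shorten_imgt_header(line)
        else:
            yield line


# Illustrative record only: the accession and sequence are placeholders.
example = [
    ">ACC0001|IGHV1-18*01|Homo sapiens|F|V-REGION|",
    "acgtacgt",
]
shortened = list(shorten_fasta(example))
```

Here `shortened` holds the same record with the header reduced to the allele name. Check that the shortened names are still unique before using the file, since duplicated names would make the assignments ambiguous.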