milaboratory / mixcr

MiXCR is an ultimate software platform for analysis of Next-Generation Sequencing (NGS) data for immune profiling.
https://mixcr.com

IMGT TRA download error #112

Closed fabio-t closed 8 years ago

fabio-t commented 8 years ago
fabio@ahmad-desktop:~$ mixcr importFromIMGT
Starting importFromIMGT.sh script
/home/fabio/.linuxbrew/Cellar/mixcr/1.8.1-1/mixcr
MiXCR v1.8.1 (built Wed Jun 29 10:35:12 CEST 2016; rev=d7c1254; branch=hotfix/v1.8.1)
Components: 
MiLib v1.5-SNAPSHOT (rev=27a3ea6; branch=develop)
By using this script you agree to the terms of use of IMGT website. (see http://www.imgt.org/ for details).
Press ENTER to continue or other key to exit...
Available species:
(0) Bos taurus
(1) Camelus dromedarius
(2) Canis lupus familiaris
(3) Cercocebus atys
(4) Danio rerio
(5) Homo sapiens
(6) Macaca fascicularis
(7) Macaca mulatta
(8) Macaca nemestrina
(9) Mus
(10) Mus cookii
(11) Mus minutoides
(12) Mus musculus
(13) Mus pahari
(14) Mus saxicola
(15) Mus spretus
(16) Oncorhynchus mykiss
(17) Ornithorhynchus anatinus
(18) Oryctolagus cuniculus
(19) Ovis aries
(20) Papio anubis anubis
(21) Rattus norvegicus
(22) Rattus rattus
(23) Sus scrofa
(24) Vicugna pacos
Please select species (e.g. '5' for Homo sapiens): 12
You selected: Mus musculus.
Please enter a list of common species names for Mus musculus delimited by ':' to be used in -s option in 'mixcr align ...' (e.g. 'hsa:hs:homosapiens:human'): mm
Getting taxonId for Mus musculus from NCBI... OK. TaxonId=10090
Creating directory for downloaded files (./imgt_downloads/)
Downloading files:
./imgt_downloads/Mus_musculus_IGHV.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_IGHD.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_IGHJ.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_IGKV.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_IGKJ.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_IGLV.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_IGLJ.fasta successfully downloaded.

./imgt_downloads/Mus_musculus_TRAV.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRAJ.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRBV.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRBD.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRBJ.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRDV.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRDD.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRDJ.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRGV.fasta successfully downloaded.
./imgt_downloads/Mus_musculus_TRGJ.fasta successfully downloaded.
Importing loci:
IGH
/home/fabio/.linuxbrew/Cellar/mixcr/1.8.1-1/mixcr importSegments -f -s 10090:mm -l IGH -r report_Mus_musculus_IGH.txt -v ./imgt_downloads/Mus_musculus_IGHV.fasta -d ./imgt_downloads/Mus_musculus_IGHD.fasta -j ./imgt_downloads/Mus_musculus_IGHJ.fasta
Warning: absent conserved Cys in functional allele IGHV1-18*02
Warning: absent conserved Cys in functional allele IGHV1-18*03
Warning: absent conserved Cys in functional allele IGHV1-42*02
Warning: absent conserved Cys in functional allele IGHV1-53*02
Warning: absent conserved Cys in functional allele IGHV1-53*03
Warning: absent conserved Cys in functional allele IGHV1-53*04
Warning: absent conserved Cys in functional allele IGHV1-55*02
Warning: absent conserved Cys in functional allele IGHV1-55*03
Warning: absent conserved Cys in functional allele IGHV1-55*04
Warning: absent conserved Cys in functional allele IGHV1-62-3*02
Warning: absent conserved Cys in functional allele IGHV1-64*02
Warning: absent conserved Cys in functional allele IGHV1-69*03
Warning: absent conserved Cys in functional allele IGHV1-72*02
Warning: absent conserved Cys in functional allele IGHV1-72*03
Warning: absent conserved Cys in functional allele IGHV1-72*05
Warning: absent conserved Cys in functional allele IGHV1-74*02
Warning: absent conserved Cys in functional allele IGHV1-74*03
Warning: absent conserved Cys in functional allele IGHV1S100*01
Warning: absent conserved Cys in functional allele IGHV1S103*01
Warning: absent conserved Cys in functional allele IGHV1S107*01
Warning: absent conserved Cys in functional allele IGHV1S108*01
Warning: absent conserved Cys in functional allele IGHV1S111*01
Warning: absent conserved Cys in functional allele IGHV1S112*02
Warning: absent conserved Cys in functional allele IGHV1S113*01
Warning: absent conserved Cys in functional allele IGHV1S113*02
Warning: absent conserved Cys in functional allele IGHV1S118*01
Warning: absent conserved Cys in functional allele IGHV1S120*01
Warning: absent conserved Cys in functional allele IGHV1S120*02
Warning: absent conserved Cys in functional allele IGHV1S121*01
Warning: absent conserved Cys in functional allele IGHV1S122*01
Warning: Skipping IGHV1S20*01 because it's sequence contains wildcards.
Warning: Skipping IGHV1S21*01 because it's sequence contains wildcards.
Warning: absent conserved Cys in functional allele IGHV1S21*02
Warning: absent conserved Cys in functional allele IGHV1S29*01
Warning: absent conserved Cys in functional allele IGHV1S31*01
Warning: absent conserved Cys in functional allele IGHV1S32*01
Warning: absent conserved Cys in functional allele IGHV1S33*01
Warning: absent conserved Cys in functional allele IGHV1S37*01
Warning: absent conserved Cys in functional allele IGHV1S44*01
Warning: Skipping IGHV1S51*01 because it's sequence contains wildcards.
Warning: absent conserved Cys in functional allele IGHV1S65*01
Warning: absent conserved Cys in functional allele IGHV1S65*03
Warning: absent conserved Cys in functional allele IGHV1S67*01
Warning: absent conserved Cys in functional allele IGHV1S67*02
Warning: absent conserved Cys in functional allele IGHV1S68*01
Warning: absent conserved Cys in functional allele IGHV1S68*02
Warning: absent conserved Cys in functional allele IGHV1S70*01
Warning: absent conserved Cys in functional allele IGHV1S72*01
Warning: absent conserved Cys in functional allele IGHV1S73*01
Warning: absent conserved Cys in functional allele IGHV1S75*01
Warning: absent conserved Cys in functional allele IGHV1S75*02
Warning: absent conserved Cys in functional allele IGHV1S78*01
Warning: absent conserved Cys in functional allele IGHV1S81*01
Warning: absent conserved Cys in functional allele IGHV1S82*01
Warning: absent conserved Cys in functional allele IGHV1S83*01
Warning: absent conserved Cys in functional allele IGHV1S87*01
Warning: absent conserved Cys in functional allele IGHV1S92*01
Warning: absent conserved Cys in functional allele IGHV1S95*01
Warning: absent conserved Cys in functional allele IGHV1S96*01
Warning: absent conserved Cys in functional allele IGHV5-12*03
Processing...
Writing report.
Writing library file.
Checking.
Segments successfully imported.
IGK
/home/fabio/.linuxbrew/Cellar/mixcr/1.8.1-1/mixcr importSegments -f -s 10090:mm -l IGK -r report_Mus_musculus_IGK.txt -v ./imgt_downloads/Mus_musculus_IGKV.fasta -j ./imgt_downloads/Mus_musculus_IGKJ.fasta
Warning: Skipping IGKV1-117*02 because it's sequence contains wildcards.
Warning: absent conserved Cys in functional allele IGKV13-85*02
Warning: absent conserved Cys in functional allele IGKV13-85*03
Error: Duplicate records for allele IGKV10-94*01
Error: Duplicate records for allele IGKV10-96*01
Processing...
Writing report.
Writing library file.
Checking.
Segments successfully imported.
IGL
/home/fabio/.linuxbrew/Cellar/mixcr/1.8.1-1/mixcr importSegments -f -s 10090:mm -l IGL -r report_Mus_musculus_IGL.txt -v ./imgt_downloads/Mus_musculus_IGLV.fasta -j ./imgt_downloads/Mus_musculus_IGLJ.fasta
Warning: absent conserved Cys in functional allele IGLV4*01
Warning: absent conserved Cys in functional allele IGLV5*01
Warning: absent conserved Cys in functional allele IGLV6*01
Warning: absent conserved Cys in functional allele IGLV6*02
Warning: absent conserved Cys in functional allele IGLV6*03
Warning: absent conserved Cys in functional allele IGLV7*01
Warning: absent conserved Cys in functional allele IGLV7*02
Warning: absent conserved Cys in functional allele IGLV8*01
Warning: absent conserved Cys in functional allele IGLV8*02
Error: Duplicate records for allele IGLV2*01
Error: Duplicate records for allele IGLV3*01
Warning: absent conserved Cys in functional allele IGLV4*01
Error: Duplicate records for allele IGLV4*01
Warning: absent conserved Cys in functional allele IGLV4*02
Warning: absent conserved Cys in functional allele IGLV8*01
Error: Duplicate records for allele IGLV8*01
Error: Duplicate records for allele IGLJ4*01
Processing...
Writing report.
Writing library file.
Checking.
Segments successfully imported.
TRA
Special parameters for Mouse TRA/D genes activated.
/home/fabio/.linuxbrew/Cellar/mixcr/1.8.1-1/mixcr importSegments -f -s 10090:mm -l TRA -r report_Mus_musculus_TRA.txt -p imgt_a1 -v ./imgt_downloads/Mus_musculus_TRAV.fasta -j ./imgt_downloads/Mus_musculus_TRAJ.fasta
Warning: A instead of conserved Cys in functional allele TRAV12D-1*03
Exception in thread "main" java.lang.IllegalArgumentException: Unknown symbol "<"
    at com.milaboratory.core.sequence.AbstractArraySequence.dataFromChars(AbstractArraySequence.java:81)
    at com.milaboratory.core.sequence.AbstractArraySequence.<init>(AbstractArraySequence.java:29)
    at com.milaboratory.core.sequence.NucleotideSequence.<init>(NucleotideSequence.java:53)
    at com.milaboratory.mixcr.reference.builder.FastaLocusBuilder.importAllelesFromStream(FastaLocusBuilder.java:221)
    at com.milaboratory.mixcr.reference.builder.FastaLocusBuilder.importAllelesFromFile(FastaLocusBuilder.java:162)
    at com.milaboratory.mixcr.cli.ActionImportSegments.go(ActionImportSegments.java:135)
    at com.milaboratory.cli.JCommanderBasedMain.main(JCommanderBasedMain.java:147)
    at com.milaboratory.mixcr.cli.Main.main(Main.java:73)
TRB
/home/fabio/.linuxbrew/Cellar/mixcr/1.8.1-1/mixcr importSegments -f -s 10090:mm -l TRB -r report_Mus_musculus_TRB.txt -v ./imgt_downloads/Mus_musculus_TRBV.fasta -d ./imgt_downloads/Mus_musculus_TRBD.fasta -j ./imgt_downloads/Mus_musculus_TRBJ.fasta
Processing...
Writing report.
Writing library file.
Checking.
Segments successfully imported.
TRG
/home/fabio/.linuxbrew/Cellar/mixcr/1.8.1-1/mixcr importSegments -f -s 10090:mm -l TRG -r report_Mus_musculus_TRG.txt -v ./imgt_downloads/Mus_musculus_TRGV.fasta -j ./imgt_downloads/Mus_musculus_TRGJ.fasta
Warning: absent conserved Cys in functional allele TRGV2*05
Warning: absent conserved Cys in functional allele TRGV3*03
Processing...
Writing report.
Writing library file.
Checking.
Segments successfully imported.
TRD
Special parameters for Mouse TRA/D genes activated.
/home/fabio/.linuxbrew/Cellar/mixcr/1.8.1-1/mixcr importSegments -f -s 10090:mm -l TRD -r report_Mus_musculus_TRD.txt -p imgt_a1 -i -v ./imgt_downloads/Mus_musculus_TRDV.fasta -d ./imgt_downloads/Mus_musculus_TRDD.fasta -j ./imgt_downloads/Mus_musculus_TRDJ.fasta
Warning: G instead of conserved Cys in functional allele TRDV1*01
Warning: L instead of conserved Cys in functional allele TRDV2-1*01
Warning: L instead of conserved Cys in functional allele TRDV2-2*01
Warning: L instead of conserved Cys in functional allele TRDV2-2*02
Warning: S instead of conserved Cys in functional allele TRDV4*01
Warning: S instead of conserved Cys in functional allele TRDV5*01
Warning: S instead of conserved Cys in functional allele TRDV5*02
Warning: S instead of conserved Cys in functional allele TRDV5*03
Warning: S instead of conserved Cys in functional allele TRDV5*04
Processing...
Writing report.
Writing library file.
Checking.
Segments successfully imported.
Resulting file contains following records:
10090:TRB: 73 records
9606:TRD: 28 records
9606:TRA: 171 records
10090:TRD: 53 records
10090:IGH: 342 records
9606:TRG: 24 records
9606:IGH: 402 records
10090:IGL: 21 records
10090:TRG: 32 records
9606:IGK: 117 records
9606:IGL: 101 records
9606:TRB: 161 records
10090:IGK: 158 records

Operation executed successfully.
See report*.txt files created in current folder for details.

To use imported segments invoke mixcr with the following parameters:
mixcr align --library local -s mm ...

You can see what happens here: http://pastebin.com/AnSMwmaR

Essentially, HTML text is appended to the TRAV file.
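
A quick way to confirm this before running the import again is to scan the downloaded files for lines that are neither FASTA headers nor sequence data. The sketch below is only an illustration, not part of MiXCR: the function name `find_non_fasta_lines` is hypothetical, and the set of allowed characters (IUPAC nucleotide codes plus `.`, `-` and `*`) is an assumption about what the IMGT files normally contain.

```python
import re
from pathlib import Path

# Assumption: sequence lines hold only IUPAC nucleotide codes, plus '.', '-'
# and '*' as gap/placeholder characters. Adjust if your files differ.
_VALID_SEQUENCE_LINE = re.compile(r"^[ACGTUNRYKMSWBDHVacgtunrykmswbdhv.\-*]*$")


def find_non_fasta_lines(path):
    """Return (line_number, text) pairs for lines that are neither
    FASTA headers (starting with '>') nor plausible sequence lines."""
    bad = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            text = line.rstrip("\r\n")
            if text.startswith(">"):
                continue  # FASTA header line
            if not _VALID_SEQUENCE_LINE.match(text):
                bad.append((number, text))
    return bad


if __name__ == "__main__":
    # ./imgt_downloads/ is the directory the import script reports creating.
    for fasta in sorted(Path("./imgt_downloads").glob("*.fasta")):
        for number, text in find_non_fasta_lines(fasta):
            print(f"{fasta}:{number}: {text[:60]}")
```

Any line it reports, such as one starting with `<`, would be consistent with the `Unknown symbol "<"` exception in the log above. Note that a stray line that happens to start with `>` would be treated as a header and not reported.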

dbolotin commented 8 years ago

I just reproduced your command and everything worked. Try re-running the import script now; maybe it was a glitch on IMGT's server that produced a malformed response.

fabio-t commented 8 years ago

You are quite correct. Thank you!