mandricigor / imrep

ImReP is a computational method for rapid and accurate profiling of the adaptive immune repertoire from regular RNA-Seq data.
https://mandricigor.github.io/imrep/

errors associated to pysam #27

Closed oeco28 closed 7 years ago

oeco28 commented 7 years ago

Dear Serghei, I am running into some issues associated with pysam, but I am not sure what the problem could be. I already have pysam installed because it is needed by other tools. I have a STAR alignment in which all unmapped reads are kept, and the command I ran is:

STAR --runMode alignReads --runThreadN 4 --genomeDir hg19_STAR --sjdbGTFfile /scratch/omar.cornejo_285595/marc/annotation_encode/gencode.v19.annotation.gtf --sjdbOverhang 80 --readFilesIn read1.fq.gz read2.fq.gz --readFilesCommand zcat --outFileNamePrefix mysample --outSAMunmapped Within --outSAMattributes NH HI NM MD --readNameSeparator space --outFilterMultimapScoreRange 1 --outSAMtype BAM SortedByCoordinate

I then ran imrep on the resulting BAM: $ python imrep.py --bam mysample.bam TCR/sample1.cdr3

and I get the following error:

Parse bam file with mapped and unmapped reads
Traceback (most recent call last):
  File "imrep.py", line 953, in <module>
    extract_mapped(i,file,k)
  File "imrep.py", line 97, in extract_mapped
    for read in samfile.fetch(chr,x,y):
  File "pysam/libcalignmentfile.pyx", line 855, in pysam.libcalignmentfile.AlignmentFile.fetch (pysam/libcalignmentfile.c:11188)
  File "pysam/libcalignmentfile.pyx", line 777, in pysam.libcalignmentfile.AlignmentFile.parse_region (pysam/libcalignmentfile.c:10647)
ValueError: invalid reference 14

Is this something you have experienced before? Or do you have some insight into what the problem could be?

regards, Omar

smangul1 commented 7 years ago

Hi Omar,

Can you please look at your BAM file and send me one line of it? I suspect that the chromosome names are chr1, chr2, and so on.

If the chromosome names are in UCSC format (i.e., the string chr followed by the chromosome number, for example chr1), you need to use the --chrFormat2 option.
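One way to check the naming style without pasting a line of the BAM is to read the reference names from the header. The following is only a minimal sketch: the function names uses_ucsc_names and bam_reference_names are hypothetical helpers, not part of imrep, and it assumes pysam is installed. The traceback above suggests fetch was asked for a reference named 14, which would not exist in a BAM whose header lists chr14 instead.

```python
def uses_ucsc_names(reference_names):
    """Return True if any reference name uses the UCSC 'chr' prefix (e.g. chr1)."""
    return any(name.startswith("chr") for name in reference_names)


def bam_reference_names(bam_path):
    """Read the reference (chromosome) names from a BAM header.

    Hypothetical helper; pysam is imported here so the pure check above
    can be used without it.
    """
    import pysam

    with pysam.AlignmentFile(bam_path, "rb") as bam:
        return list(bam.references)


# Example use (the path is the one from this thread, not verified here):
#   names = bam_reference_names("mysample.bam")
#   if uses_ucsc_names(names):
#       print("UCSC-style names found; --chrFormat2 is likely needed")
```

If the check returns True, the header uses chr1, chr2, ... and the --chrFormat2 option mentioned above should apply.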

Thanks, Serghei

oeco28 commented 7 years ago

Apologies for the mistake. You are totally right: the BAM is in chr1, chr2, ... format. Thanks!

Omar

smangul1 commented 7 years ago

Great! I am glad it worked out. I am closing the issue now.