biocompibens / ALFA

ALFA: Annotation Landscape for Aligned Reads
MIT License

Error: the file 'GTF_FILE.stranded.ALFA_index.stranded.ALFA_index' doesn't exist. #6

Open LoransM opened 4 years ago

LoransM commented 4 years ago

Hello again,

After generating the ALFA index files I get an issue when processing reads:

rna2@d33502:~/ALFA$ python3 alfa.py -g GTF_FILE.stranded.ALFA_index --bam iseq100.l20.n0.a.m5.bam -s forward -d 1 -t 0 100 --pdf output.pdf

ALFA

Checking parameters

Error: the file 'GTF_FILE.stranded.ALFA_index.stranded.ALFA_index' and/or the file 'GTF_FILE.stranded.ALFA_index.unstranded.ALFA_index' doesn't exist.

End of program

I am very keen to use this program and would appreciate your help. Thanks.

mbahin commented 4 years ago

Hello,

Actually, when you generated the index, you specified an index name (argument "-g"). You should reuse that same name in your next command (still with the "-g" argument), without the ".stranded.ALFA_index" or ".unstranded.ALFA_index" suffix.

Cheers, Mathieu
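
The rule described above can be sketched as a small helper that turns an index file name back into the base name expected by "-g". The function name index_basename is hypothetical and not part of ALFA; the two suffixes are the ones quoted in the error message above.

```python
# Hypothetical helper, not part of ALFA: recover the base name to pass to "-g".
# The two suffixes are taken from the error message quoted in this thread.
INDEX_SUFFIXES = (".stranded.ALFA_index", ".unstranded.ALFA_index")


def index_basename(name: str) -> str:
    """Strip a trailing ALFA index suffix, if present, from a file name."""
    for suffix in INDEX_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    # No known suffix: assume the caller already passed the base name.
    return name


print(index_basename("GTF_FILE.stranded.ALFA_index"))  # GTF_FILE
```

With this, the first command in the thread would pass "-g GTF_FILE" rather than "-g GTF_FILE.stranded.ALFA_index".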

LoransM commented 4 years ago

Thanks. I tried it out and now I am getting another error:

rna2@d33502:~/ALFA$ python3 alfa.py -g GTF_FILE --bam iseq100.l20.n0.a.m5.bam iseq100.l20.n0.a.m5 -s forward -d 1 -t 0 100 --pdf output.pdf

ALFA

Checking parameters

Reference genome chromosome(s): ['1', '18', '19', '2', '20', '21', '22', '3', '4', '5', '6', '10', '7', '8', '9', 'MT', 'X', 'GL000191.1', 'GL000192.1', 'GL000193.1', 'GL000194.1', 'GL000195.1', 'GL000196.1', 'GL000199.1', 'GL000201.1', 'GL000204.1', 'GL000205.1', 'GL000209.1', 'GL000211.1', 'GL000212.1', 'GL000213.1', 'GL000215.1', 'GL000216.1', 'GL000218.1', 'GL000219.1', 'GL000220.1', 'GL000221.1', 'GL000222.1', 'GL000223.1', 'GL000224.1', 'GL000225.1', 'GL000228.1', 'GL000229.1', 'GL000230.1', 'GL000231.1', 'GL000233.1', 'GL000236.1', 'GL000237.1', 'GL000240.1', 'GL000241.1', 'GL000242.1', 'GL000243.1', 'GL000247.1', 'Y', '11', '12', '13', '14', '15', '16', '17']

BAM file chromosome(s): ['chr10', 'chr11', 'chr11_gl000202_random', 'chr12', 'chr13', 'chr14', 'chr15', 'chr16', 'chr17_ctg5_hap1', 'chr17', 'chr17_gl000203_random', 'chr17_gl000204_random', 'chr17_gl000205_random', 'chr17_gl000206_random', 'chr18', 'chr18_gl000207_random', 'chr19', 'chr19_gl000208_random', 'chr19_gl000209_random', 'chr1', 'chr1_gl000191_random', 'chr1_gl000192_random', 'chr20', 'chr21', 'chr21_gl000210_random', 'chr22', 'chr2', 'chr3', 'chr4_ctg9_hap1', 'chr4', 'chr4_gl000193_random', 'chr4_gl000194_random', 'chr5', 'chr6_apd_hap1', 'chr6_cox_hap2', 'chr6_dbb_hap3', 'chr6', 'chr6_mann_hap4', 'chr6_mcf_hap5', 'chr6_qbl_hap6', 'chr6_ssto_hap7', 'chr7', 'chr7_gl000195_random', 'chr8', 'chr8_gl000196_random', 'chr8_gl000197_random', 'chr9', 'chr9_gl000198_random', 'chr9_gl000199_random', 'chr9_gl000200_random', 'chr9_gl000201_random', 'chrM', 'chrUn_gl000211', 'chrUn_gl000212', 'chrUn_gl000213', 'chrUn_gl000214', 'chrUn_gl000215', 'chrUn_gl000216', 'chrUn_gl000217', 'chrUn_gl000218', 'chrUn_gl000219', 'chrUn_gl000220', 'chrUn_gl000221', 'chrUn_gl000222', 'chrUn_gl000223', 'chrUn_gl000224', 'chrUn_gl000225', 'chrUn_gl000226', 'chrUn_gl000227', 'chrUn_gl000228', 'chrUn_gl000229', 'chrUn_gl000230', 'chrUn_gl000231', 'chrUn_gl000232', 'chrUn_gl000233', 'chrUn_gl000234', 'chrUn_gl000235', 'chrUn_gl000236', 'chrUn_gl000237', 'chrUn_gl000238', 'chrUn_gl000239', 'chrUn_gl000240', 'chrUn_gl000241', 'chrUn_gl000242', 'chrUn_gl000243', 'chrUn_gl000244', 'chrUn_gl000245', 'chrUn_gl000246', 'chrUn_gl000247', 'chrUn_gl000248', 'chrUn_gl000249', 'chrX', 'chrY']

Error: no matching chromosome between the BAM file 'iseq100.l20.n0.a.m5.bam' and the reference genome.

End of program

I mapped to hg19. I hope you can assist me.

Marie Lorans PhD fellow Aarhus University Institute of Molecular Biology and Genetics C.F. Møllers Alle 3 8000 Aarhus Denmark Ph: +4531717672

mbahin commented 4 years ago

Hi again,

As you can see in the output, the chromosome names in your reference genome differ from the ones in your BAM file(s). ALFA has no way to infer the correspondence, so you have to make them match by changing them either in the reference or in the BAM file(s).

Cheers, Mathieu
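
One way to make the names match is to rewrite the first column of the annotation file before indexing it. The following is a minimal sketch, not an ALFA feature: the function names are hypothetical, and it assumes that only the primary chromosomes are renamed ('1' to 'chr1', 'MT' to 'chrM', and so on). Unplaced scaffolds such as 'GL000191.1' have no simple prefix rule, so this sketch drops them; an explicit lookup table would be needed to keep them. Whether 'MT' and 'chrM' refer to the same sequence in your builds should be checked separately.

```python
import re

# Assumption: only primary chromosomes (1-22, X, Y) get a plain "chr" prefix.
PRIMARY = re.compile(r"^(?:[1-9]|1[0-9]|2[0-2]|X|Y)$")


def to_ucsc(name):
    """Map an Ensembl-style name to a UCSC-style one, or None if unknown."""
    if name == "MT":
        return "chrM"
    if PRIMARY.match(name):
        return "chr" + name
    return None  # scaffolds would need an explicit lookup table


def rename_gtf_line(line):
    """Return the GTF line with a renamed first column, or None to drop it."""
    if line.startswith("#"):
        return line  # header/comment lines are kept unchanged
    fields = line.split("\t")
    new_name = to_ucsc(fields[0])
    if new_name is None:
        return None
    fields[0] = new_name
    return "\t".join(fields)


def convert_gtf(src, dst):
    """Write a copy of the GTF at src to dst with renamed chromosomes."""
    with open(src) as fin, open(dst, "w") as fout:
        for line in fin:
            converted = rename_gtf_line(line)
            if converted is not None:
                fout.write(converted)
```

Since the index is built from the annotation, it would have to be generated again from the converted file, under a new index name.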

LoransM commented 4 years ago

Hi again, by reference genome, do you mean the annotation file I have indexed?

Thanks so much!

mbahin commented 4 years ago

Yes.

LoransM commented 4 years ago

Okay, I did that. What is the associated label? I now get this error:

rna2@d33502:~/ALFA$ python3 alfa.py -g iseq_index --bam iseq100.l20.n0.a.m5.bam bam Label1 -s forward -d 1 -t 0 100 --pdf output.pdf

ALFA

Checking parameters

Error: Make sure to follow the expected format: --bam BAM_file1 Label1 [BAM_file2 Label2 ...].

I have single-end reads, not paired-end.
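
Judging from the error message, every BAM file after "--bam" must be followed by exactly one label, presumably the sample name shown in the output. The command above passes three values ("iseq100.l20.n0.a.m5.bam", "bam", "Label1"), which cannot be split into file/label pairs. The sketch below illustrates that pairing rule; pair_bam_args is a hypothetical helper and not ALFA's actual parsing code, and the thread does not confirm that this was the cause of the error.

```python
def pair_bam_args(tokens):
    """Group the values given after --bam into (bam_file, label) pairs."""
    # An empty list or an odd number of values cannot form file/label pairs.
    if not tokens or len(tokens) % 2 != 0:
        raise ValueError(
            "expected format: --bam BAM_file1 Label1 [BAM_file2 Label2 ...]"
        )
    # Even positions are files, odd positions are their labels.
    return list(zip(tokens[0::2], tokens[1::2]))


# Two values: one file and its label.
print(pair_bam_args(["iseq100.l20.n0.a.m5.bam", "Label1"]))

# Three values, as in the command above: rejected.
try:
    pair_bam_args(["iseq100.l20.n0.a.m5.bam", "bam", "Label1"])
except ValueError as err:
    print("Error:", err)
```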

LoransM commented 4 years ago

I moved my bam files to another folder and now it works! Your help has been great, thanks so much!

LoransM commented 4 years ago

Actually, I was interested in quantifying the proportion of reads mapping to miRNA and other types of non-coding RNA, as in the example (samples 1 and 2) you give in the instructions. But the graph output I got doesn't include those (attached). Is it possible to include those RNA species in the count file and graphs?
