atulkakrana / sPARTA

miRNA-target prediction and PARE-seq based validation tool - uses MapReduce model [Published]

sPARTA running issue #2

Closed pokhrelsuresh closed 8 years ago

pokhrelsuresh commented 8 years ago

[suresh@tarkan sparta]$ python3 sPARTA.py -genomeFile tomato_genome.fa -annoType GFF -annoFile ITAG_pre2.3_gene_models.corrected.gff -genomeFeature 1 -miRNAFile miRNA_tomato_new.fa &

[1] 21606
[suresh@tarkan sparta]$ ++Checking for required libraries and components ######
--numpy : found
--scipy : found

Fn: genomeReader #########################################
Caching genome fasta
Genome dict prepared for 13 chromosome/scaffolds

Fn: gffParser ############################################
Entries in genome_info:34727

Fn: extractFeatures ######################################

-Caching gene coords for chromosome: ch01 and strand: c
-Caching gene coords for chromosome: ch01 and strand: w
-Caching gene coords for chromosome: ch02 and strand: c
-Caching gene coords for chromosome: ch02 and strand: w
-Caching gene coords for chromosome: ch03 and strand: c
-Caching gene coords for chromosome: ch03 and strand: w
-Caching gene coords for chromosome: ch04 and strand: c
-Caching gene coords for chromosome: ch04 and strand: w
-Caching gene coords for chromosome: ch05 and strand: c
-Caching gene coords for chromosome: ch05 and strand: w
-Caching gene coords for chromosome: ch06 and strand: c
-Caching gene coords for chromosome: ch06 and strand: w
-Caching gene coords for chromosome: ch07 and strand: c
-Caching gene coords for chromosome: ch07 and strand: w
-Caching gene coords for chromosome: ch08 and strand: c
-Caching gene coords for chromosome: ch08 and strand: w
-Caching gene coords for chromosome: ch09 and strand: c
-Caching gene coords for chromosome: ch09 and strand: w
-Caching gene coords for chromosome: ch10 and strand: c
-Caching gene coords for chromosome: ch10 and strand: w
-Caching gene coords for chromosome: ch11 and strand: c
-Caching gene coords for chromosome: ch11 and strand: w
-Caching gene coords for chromosome: ch12 and strand: c
-Caching gene coords for chromosome: ch12 and strand: w
-Caching gene coords for chromosome: ch13 and strand: c
-Caching gene coords for chromosome: ch13 and strand: w
Number of coords in 'coords' list: 34386

Fn: getFASTA1

++Reading chromosome:ch01 and strand:'c' ################
--Fetching gene:Solyc01g005020_down
Traceback (most recent call last):
  File "sPARTA.py", line 2631, in <module>
    main()
  File "sPARTA.py", line 2369, in main
    fastaOut,fastaList = getFASTA1(args.genomeFile,coords,chromoDict) ##Creates FASTA file
  File "sPARTA.py", line 598, in getFASTA1
    chromo = str(chromoDict[chrKey])
KeyError: 'ch01'

atulkakrana commented 8 years ago

Hi Suresh,

There seems to be an issue with your genome or GFF file. Both should use the same identifiers for chromosomes. I am guessing that your genome has something like '>Chr1' as its chromosome identifiers while the GFF has 'ch01', so the lookup for 'ch01' in the genome fails with the KeyError above. This mismatch is unusual.
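A quick way to confirm such a mismatch is to list the identifiers each file uses and compare them. The following is only a minimal sketch, not part of sPARTA: the function names are made up for illustration, and it assumes the chromosome identifier is the first whitespace-separated token of each FASTA header and the first tab-separated column of each GFF line. The sample lines are invented, since the actual genome headers are not shown in this thread.

```python
def fasta_ids(lines):
    """Collect sequence identifiers from FASTA header lines.

    Assumption: the identifier is the first token after '>'.
    """
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gff_ids(lines):
    """Collect sequence identifiers from column 1 of a GFF file."""
    ids = set()
    for line in lines:
        # Skip blank lines and comment/directive lines.
        if line.strip() and not line.startswith("#"):
            ids.add(line.split("\t")[0])
    return ids


def compare_ids(fasta_lines, gff_lines):
    """Return (ids only in the FASTA, ids only in the GFF), both sorted."""
    in_fasta = fasta_ids(fasta_lines)
    in_gff = gff_ids(gff_lines)
    return sorted(in_fasta - in_gff), sorted(in_gff - in_fasta)


# Invented sample data standing in for the real files.
sample_fasta = [">Chr1 example description\n", "ACGTACGT\n"]
sample_gff = [
    "##gff-version 3\n",
    "ch01\tsource\tgene\t1\t8\t.\t+\t.\tID=gene1\n",
]
print(compare_ids(sample_fasta, sample_gff))
```

For real files, the same functions can be given open file handles, for example `compare_ids(open("tomato_genome.fa"), open("ITAG_pre2.3_gene_models.corrected.gff"))`. If both returned lists are empty, the identifiers agree.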

Fix the identifiers in either file so that they match, or download both the genome and the GFF from the same source.
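If you go the renaming route, one option is to rewrite the FASTA headers so they match the GFF. This is a rough sketch under assumptions, not something sPARTA provides: `rename_fasta_headers` and the `id_map` mapping are hypothetical, and the 'Chr1' to 'ch01' pair is only a guess at what the mismatch might look like.

```python
def rename_fasta_headers(lines, id_map):
    """Yield FASTA lines with header identifiers replaced using id_map.

    Assumption: the identifier is the first token after '>'. Any
    description after it is kept; unmapped identifiers are left unchanged.
    """
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split(None, 1)
            old_id = fields[0] if fields else ""
            new_id = id_map.get(old_id, old_id)
            rest = " " + fields[1].rstrip() if len(fields) > 1 else ""
            yield ">" + new_id + rest + "\n"
        else:
            # Sequence lines pass through untouched.
            yield line


# Hypothetical mapping from genome identifiers to GFF identifiers.
id_map = {"Chr1": "ch01"}
sample = [">Chr1 example description\n", "ACGTACGT\n"]
for out_line in rename_fasta_headers(sample, id_map):
    print(out_line, end="")
```

Writing the result to a new file instead of overwriting the original genome keeps the download intact in case the mapping turns out to be wrong.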

Atul

pokhrelsuresh commented 8 years ago

thanks Atul, it worked.