wyang17 / SQuIRE

Software for Quantifying Interspersed Repeat Expression

Error in Count for non-UCSC genome #75

Open sergiogonmoll opened 2 years ago

sergiogonmoll commented 2 years ago

Hi, I'm running SQuIRE to quantify TEs on a non-UCSC genome and I managed to finish the mapping step, which seems fine. Next, I tried the Count command for one of my samples and I get the following error message:

Script Arguments

count_folder=/crex/proj/snic2020-16-42/Sergio_temp/02_Scripts/squire_count/
EM=auto
clean_folder=/crex/proj/snic2020-16-42/Sergio_temp/02_Scripts/squire_clean/
fetch_folder=/crex/proj/snic2020-16-42/Sergio_temp/01_RawData/Genome/squire_fetch/
name=CD50_cold
tempfolder=False
verbosity=True
pthreads=8
strandedness=0
read_length=100
build=BUILD
func=<function main at 0x2b8f33161140>
map_folder=/crex/proj/snic2020-16-42/Sergio_temp/02_Scripts/squire_map/

Quantifying Gene expression 2022-04-17 14:43:29.298779

Running Guided Stringtie on each bamfile CD50_cold 2022-04-17 14:43:29.300392

Error: invalid genomic sequence data (NW_021164365.1)!
Traceback (most recent call last):
  File "/sw/bioinfo/SQuIRE/885bf4d-20190301/rackham/bin/squire", line 11, in <module>
    load_entry_point('SQuIRE', 'console_scripts', 'squire')()
  File "/sw/bioinfo/SQuIRE/885bf4d-20190301/src/SQuIRE/squire/cli.py", line 156, in main
    subargs.func(args = subargs)
  File "/sw/bioinfo/SQuIRE/885bf4d-20190301/src/SQuIRE/squire/Count.py", line 1569, in main
    Stringtie(bamfile,outfolder,basename,strandedness,pthreads,ingtf, verbosity,outgtf_ref_temp)
  File "/sw/bioinfo/SQuIRE/885bf4d-20190301/src/SQuIRE/squire/Count.py", line 169, in Stringtie
    sp.check_call(["/bin/sh", "-c", StringTiecommand])
  File "/sw/comp/python/2.7.15_rackham/lib/python2.7/subprocess.py", line 190, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/bin/sh', '-c', 'stringtie -p 8 -f 0.1 -m 200 -a 10 -j 1 -g 50 -M 0.95 -c 2.5 -e -o /crex/proj/snic2020-16-42/Sergio_temp/02_Scripts/squire_count/CD50_cold_outgtf_ref.tmpl2nBva -A /crex/proj/snic2020-16-42/Sergio_temp/02_Scripts/squire_count/CD50_cold_outabund_ref.tmpl2nBva -G /crex/proj/snic2020-16-42/Sergio_temp/01_RawData/Genome/squire_fetch/BUILD_refGene.gtf /crex/proj/snic2020-16-42/Sergio_temp/02_Scripts/squire_map/CD50_cold_E3_forward_paired.fq.bam']' returned non-zero exit status 1

I tried to find the genomic accession in the map LogOut file and it is there, so I don't understand where the command goes wrong. I appreciate any help!

sergiogonmoll commented 2 years ago

I updated StringTie to a newer version, which seemed to fix that issue, but now I get the following KeyError:

  File "/sw/bioinfo/SQuIRE/885bf4d-20190301/src/SQuIRE/squire/Count.py", line 218, in filter_tx
    gene_dict[(gtf_line.Gene_ID,gtf_line.strand)].add_counts(counts)
KeyError: ('', '+')
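
The key in this KeyError is ('', '+'), i.e. an empty gene ID paired with the plus strand, so one thing that may be worth checking is whether some GTF records have a missing or empty gene_id attribute. The sketch below is only a diagnostic under that assumption; it is not taken from SQuIRE, and find_missing_gene_ids is a hypothetical helper name. It reads GTF lines and reports records with no usable gene_id. Which GTF to check (for example the annotation passed to StringTie with -G, or the GTF that StringTie writes) is also an assumption.

```python
import re

# Matches the gene_id attribute in GTF column 9, e.g. gene_id "ABC1";
GENE_ID_RE = re.compile(r'gene_id "([^"]*)"')


def find_missing_gene_ids(gtf_lines):
    """Return (line_number, feature, strand) for records lacking a non-empty gene_id.

    Hypothetical diagnostic helper, not part of SQuIRE.
    """
    problems = []
    for lineno, line in enumerate(gtf_lines, start=1):
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9:
            # A GTF record needs 9 tab-separated columns
            problems.append((lineno, "malformed", "?"))
            continue
        match = GENE_ID_RE.search(fields[8])
        if match is None or match.group(1) == "":
            problems.append((lineno, fields[2], fields[6]))
    return problems


def report(gtf_path):
    """Print every suspicious record found in the GTF file at gtf_path."""
    with open(gtf_path) as handle:
        for lineno, feature, strand in find_missing_gene_ids(handle):
            print("line %d: %s on strand %s has no gene_id" % (lineno, feature, strand))
```

Calling report() with the path to the GTF in question lists the offending lines, if any. If nothing is reported, the empty gene ID probably comes from somewhere else, and this check does not explain the error.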