Closed: morellr closed this 5 years ago
This is definitely a strange fasta name. The script is trying to get the splint position out of the fasta name, but when it splits the name, it gets a letter (c) instead of a number. Did you adjust your sequence headers?
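One way to check the headers is to scan the FASTQ for names whose second underscore-separated field is not an integer. The sketch below is only an illustration: `find_bad_headers` is a hypothetical helper, not part of createConsensi.py, and it assumes 4-line FASTQ records and that the name is split on "_" (the thread does not show how `name_root` is actually built). The second record in the example is made up to show a failing case.

```python
def find_bad_headers(fastq_lines):
    """Return headers whose second '_'-separated field is not an integer.

    Assumption: every record is exactly 4 lines, header first.
    """
    bad = []
    for i, line in enumerate(fastq_lines):
        if i % 4 != 0:  # only the first line of each record is a header
            continue
        header = line.strip().lstrip("@")
        fields = header.split("_")  # assumed split character
        if len(fields) < 2 or not fields[1].isdigit():
            bad.append(header)
    return bad


example = [
    "@672de281-f276-48f8-b1b2-b3e0d249c56c_3075", "ACGT", "+", "!!!!",
    "@some_c_header", "ACGT", "+", "!!!!",  # made-up header for illustration
]
print(find_bad_headers(example))
```

To run it on a real file, pass an open file handle, e.g. `find_bad_headers(open("R2C2_raw_reads.fastq"))`.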
Hi rvolden,
The read names before and after C3POa_preprocessing look like this:
@672de281-f276-48f8-b1b2-b3e0d249c56c (read name in the original AlbOut1/0/workspace/pass/*.fastq)
@672de281-f276-48f8-b1b2-b3e0d249c56c_3075 (read name in R2C2_raw_reads.fastq)
i.e. "_3075" is the splint position.
When I run C3POa.py, the read names in R2C2_Consensus.fasta appear to have lost the splint position information (?), e.g.
672de281-f276-48f8-b1b2-b3e0d249c56c_11.93_4271_1_2426
The initial stdout messaging from C3POa.py shows this error:
./tmp1/672de281-f276-48f8-b1b2-b3e0d249c56c_consensus_1.fasta
./tmp1/c1623b0d-fc3d-4f57-8ed8-408a70b7bd0e_consensus_1.fasta
/usr/lib64/python3.6/site-packages/numpy/core/_methods.py:59: RuntimeWarning: Mean of empty slice.
  warnings.warn("Mean of empty slice.", RuntimeWarning)
/usr/lib64/python3.6/site-packages/numpy/core/_methods.py:70: RuntimeWarning: invalid value encountered in double_scalars
  ret = ret.dtype.type(ret / rcount)
but proceeds to produce output that looked correct to me, and is read by C3POa_postprocessing.py, so I ignored the error message. But now I'm wondering if this is the step where I somehow fail to create the right fasta name.
After C3POa_postprocessing.py, the readname looks like this:
672de281-f276-48f8-b1b2-b3e0d249c56c_11.93_4271_1_2426_2107
This looks fine to me, since you shouldn't need the seed after C3POa.py. The numpy runtime warning is normal and can be ignored.
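For reference, here is a small sketch that splits the two read names quoted in this thread on "_" to show which fields each one carries. The split character is an assumption, and the meaning of the individual fields is not stated in the thread, so the code only lists them.

```python
# Read names copied from this thread.
raw_name = "672de281-f276-48f8-b1b2-b3e0d249c56c_3075"
post_name = "672de281-f276-48f8-b1b2-b3e0d249c56c_11.93_4271_1_2426_2107"

# Assumed split character: "_" (the UUID part only contains "-").
raw_fields = raw_name.split("_")
post_fields = post_name.split("_")

print(raw_fields)   # read ID followed by one trailing field
print(post_fields)  # read ID followed by five trailing fields

# The trailing field of the raw read name parses as an integer.
seed = int(raw_fields[1])
print(seed)
```

Under that assumption, the second field of the post-processed name is "11.93", which is neither an integer nor the letter "c" from the traceback.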
Hi, when running defineAndQuantifyWrapper.py, the call to createConsensi.py generates this error:
Traceback (most recent call last):
  File "/usr/local/bin/createConsensi.py", line 221, in <module>
    corrected_consensus, repeats = determine_consensus(name, fasta, fastq)
  File "/usr/local/bin/createConsensi.py", line 142, in determine_consensus
    fastq_reads = read_fastq_file(fastq)
  File "/usr/local/bin/createConsensi.py", line 101, in read_fastq_file
    name, seed = name_root[0], int(name_root[1])
ValueError: invalid literal for int() with base 10: 'c'
I tried modifying line 101 in createConsensi.py from:
name, seed = name_root[0], int(name_root[1])
to:
name, seed = name_root[0], int(float(name_root[1]))
and got the same error.
Then I tried:
name, seed = name_root[0], int(name_root[1], 16)
This ran to completion and generated output; however, the Isoform_Consensi.fasta and Isoform_Consensi_filtered.fasta files have only the headers, no sequence. e.g.
Now I'm not sure whether to keep trying to figure out the problem in createConsensi.py, or whether I have somehow generated illegal fasta names in the C3POa pipeline.
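The three parsing attempts described above can be reproduced on the literal string 'c' from the traceback. This is a standalone sketch, not code from createConsensi.py; `parse_attempts` is a hypothetical helper. It shows that base 16 succeeds only because "c" happens to be a valid hexadecimal digit, so that change hides the unexpected field rather than recovering a real seed.

```python
def parse_attempts(field):
    """Try the three conversions from the thread and record each outcome."""
    attempts = {
        "int(x)": int,
        "int(float(x))": lambda x: int(float(x)),
        "int(x, 16)": lambda x: int(x, 16),
    }
    results = {}
    for label, parse in attempts.items():
        try:
            results[label] = parse(field)
        except ValueError as err:
            results[label] = f"ValueError: {err}"
    return results


results = parse_attempts("c")
for label, outcome in results.items():
    print(label, "->", outcome)
# int(x) -> ValueError: invalid literal for int() with base 10: 'c'
# int(float(x)) -> ValueError: could not convert string to float: 'c'
# int(x, 16) -> 12
```

Since 12 is unlikely to be a meaningful splint position for that read, the empty sequences in the output may well be a downstream effect of this masked value, although the thread does not confirm that.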