Closed (dietvin closed this issue 1 year ago)
Hi Vincent (@dietvin),
Could you please show the first 10 lines of your eventalign.txt file (the output of head eventalign.txt)?
Thanks!
Best wishes, Yuk Kei
The eventalign.txt looks like this:
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx
IRESeGFP5-complete 0 GGGCG 0 t 1 118.61 5.756 0.00465 GGGCG 108.23 5.31 1.70 26376 26390
IRESeGFP5-complete 0 GGGCG 0 t 2 106.47 6.813 0.00764 GGGCG 108.23 5.31 -0.29 26353 26376
IRESeGFP5-complete 1 GGCGA 0 t 3 106.97 9.815 0.02457 GGCGA 92.44 8.39 1.50 26279 26353
IRESeGFP5-complete 2 GCGAA 0 t 4 93.18 3.578 0.00896 GCGAA 92.59 3.99 0.13 26252 26279
IRESeGFP5-complete 3 CGAAT 0 t 5 119.27 5.203 0.00365 CGAAT 115.65 5.56 0.57 26241 26252
IRESeGFP5-complete 4 GAATT 0 t 6 114.70 1.150 0.00266 GAATT 112.11 3.11 0.73 26233 26241
IRESeGFP5-complete 4 GAATT 0 t 7 113.83 6.624 0.00664 GAATT 112.11 3.11 0.48 26213 26233
IRESeGFP5-complete 5 AATTG 0 t 8 102.80 5.963 0.00830 AATTG 100.78 5.53 0.32 26188 26213
IRESeGFP5-complete 6 ATTGG 0 t 9 84.47 1.255 0.00465 ATTGG 86.04 2.65 -0.52 26174 26188
Hi Vincent (@dietvin),
I ran xpore dataprep on the 10 lines provided above, which produced the following eventalign.index:
transcript_id,read_index,pos_start,pos_end
IRESeGFP5-complete,0,172,970
I do have a question: is IRESeGFP5-complete the only contig in your eventalign.txt file?
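One way to answer this is to count the event rows per contig directly from the file. The following is a minimal Python sketch, not part of xpore or nanopolish. It assumes the file is tab-separated with a header row that includes a contig column, as in the excerpt above; the function name count_contigs is hypothetical.

```python
import csv
from collections import Counter


def count_contigs(eventalign_path):
    """Count event rows per contig in a tab-separated eventalign file.

    Assumes the first line is a header containing a 'contig' column.
    """
    counts = Counter()
    with open(eventalign_path, newline="") as handle:
        # Stream the file row by row so large eventalign files are not loaded at once.
        reader = csv.DictReader(handle, delimiter="\t")
        for row in reader:
            counts[row["contig"]] += 1
    return counts
```

count_contigs("eventalign.txt") returns a Counter keyed by contig name, so a file with a single contig yields a single entry.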
Thanks!
Best wishes, Yuk Kei
Seeing that it works for you, I just rechecked my workflow and realized that I had a mixup with the names of the eventalign files and dataprep folders. I fixed that and now it works fine.
I'm sorry for the work that it caused, but thank you very much for your help! I really appreciate the fast and helpful replies.
Best Vincent
Hello,
when running diffmod on my data, I get a diffmod.table output that contains only the header. The diffmod.log and the dataprep data.log both show that the processes finished successfully.
While looking through the files I noticed that the eventalign.index only contains the header as well. All other files from the dataprep look as expected when compared to the test data.
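A quick way to confirm this for each dataprep folder is to count the rows below the header in eventalign.index. This is a minimal sketch, assuming the comma-separated layout with a single header line (transcript_id,read_index,pos_start,pos_end) shown earlier in the thread; count_index_rows is a hypothetical helper, not an xpore function.

```python
import csv


def count_index_rows(index_path):
    """Return the number of non-empty data rows below the header line."""
    with open(index_path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header line, if the file has one
        return sum(1 for row in reader if row)
```

A result of 0 means the index holds only the header (or is empty), which matches the symptom described here. It is then worth checking that the dataprep folder really corresponds to the eventalign file that was passed in.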
I used the following commands:
Nanopolish:
nanopolish eventalign -t 32 --scale-events --signal-index --reads [FASTQ] --bam [BAM] --genome [FASTA] > [EVENTALIGN]
xpore dataprep:
xpore dataprep --eventalign [EVENTALIGN] --out_dir [DATAPREP-DIR] --n_processes 32
xpore diffmod:
xpore diffmod --config $config --n_processes 32
While running dataprep I got the warnings below, but no error message.
/.../anaconda/envs/xpore/lib/python3.8/site-packages/xpore/scripts/dataprep.py:21: PerformanceWarning: indexing past lexsort depth may impact performance.
  pos_end += eventalign_result.loc[index]['linelength'].sum()
/.../anaconda/envs/xpore/lib/python3.8/site-packages/xpore/scripts/dataprep.py:72: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  chunk_split['linelength'] = np.array(lines)
It would be great if you can help me out. Please let me know if you need any more information.
Thanks in advance Vincent