Closed anaesco closed 8 months ago
Hi @anaesco,
Your first data.log file uses the genome reference instead of the transcriptome reference, while your second data.log file uses the transcriptome reference, which is correct. However, xpore dataprep cannot match the reads to the reference, because the first column of the eventalign.txt file probably looks like the following:
ENST00000416931.1|ENSG00000225972.1|OTTHUMG00000002338.1|OTTHUMT00000006720.1|MTND1P23-201|MTND1P23|372|unprocessed_pseudogene|
while the reference transcript id looks like the following:
ENST00000416931.1
One thing you will have to do is modify the first column of the eventalign.txt file so that it matches the reference. Here is an example Python script (not tested):
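import sys

# Path to the eventalign.txt file, passed as the first command-line argument.
fn = sys.argv[1]

# The output is written to 'new' + fn, so fn should be a bare file name in the current directory.
with open(fn, 'r') as infile, open('new' + fn, 'w') as outfile:
    for ln in infile:
        fields = ln.split('\t')
        # Keep only the transcript id, i.e. the text before the first '|'.
        fields[0] = fields[0].split('|')[0]
        outfile.write('\t'.join(fields))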
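To confirm that the id mismatch is the cause before rewriting a large file, the contig names can be compared against the reference transcript ids directly. This is only a sketch and not part of xpore: `trim_contig` and `count_matches` are hypothetical helper names, and it assumes you have already collected the reference transcript ids into a Python set and that the header row starts with `contig`.

```python
def trim_contig(contig):
    """Return the part of a contig name before the first '|'."""
    return contig.split('|')[0]


def count_matches(eventalign_lines, reference_ids):
    """Return (raw_matches, trimmed_matches) over the distinct contig names.

    eventalign_lines: iterable of tab-separated lines, header included.
    reference_ids: set of transcript ids from the reference (assumed input).
    """
    contigs = set()
    for line in eventalign_lines:
        contig = line.rstrip('\n').split('\t', 1)[0]
        # Skip blank lines and the header row (assumed to start with 'contig').
        if contig and contig != 'contig':
            contigs.add(contig)
    raw = sum(1 for c in contigs if c in reference_ids)
    trimmed = sum(1 for c in contigs if trim_contig(c) in reference_ids)
    return raw, trimmed


if __name__ == '__main__':
    long_name = ('ENST00000416931.1|ENSG00000225972.1|OTTHUMG00000002338.1|'
                 'OTTHUMT00000006720.1|MTND1P23-201|MTND1P23|372|'
                 'unprocessed_pseudogene|')
    lines = [
        'contig\tposition\treference_kmer\n',
        long_name + '\t20\tAAGGG\n',
        long_name + '\t21\tAGGGA\n',
    ]
    print(count_matches(lines, {'ENST00000416931.1'}))
```

With the example id from this thread, the untrimmed name matches nothing and the trimmed name matches one transcript, so the script prints `(0, 1)`. If the first number is already non-zero for your real file, the first column is probably not the problem.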
Thanks!
Best wishes, Yuk Kei
Hi there!
Thanks so much for responding. I figured it was the first column in the eventalign file since I was reading the closed issues section and someone had a similar issue.
The file is now edited and I was able to run diffmod! Thank you so much for your help.
Here is the output of the new file
contig position reference_kmer read_index strand event_index event_level_mean event_stdv event_length model_kmer model_mean model_stdv standardized_level start_idx end_idx
ENST00000416931.1 20 AAGGG 9 t 217 119.94 11.215 0.01062 AAGGG 113.12 7.84 0.73 20790 20822
ENST00000416931.1 20 AAGGG 9 t 218 132.64 3.769 0.00232 AAGGG 113.12 7.84 2.09 20783 20790
ENST00000416931.1 21 AGGGA 9 t 219 106.53 2.761 0.00199 AGGGA 115.88 4.05 -1.94 20777 20783
As a final question, is there any documentation on which code was used to produce the results (data visualization)? Any guidance would be great! Thank you!!
Hi @anaesco,
We have a tutorial that will be included in ONT's EPI2ME-LABS soon, and it includes some visualization code. You can find it here: https://drive.google.com/file/d/1xEddN-1mFXeKsaxqCmLt4q-lVvMhL_A4/view?usp=share_link
Thanks!
Best wishes, Yuk Kei
Hi there,
I have successfully run the data preparation from raw reads and the xpore preprocess steps, but when I run xpore diffmod, the output files are empty. This is the command and its output:
$ xpore diffmod --config IGF2_SCRAM_cell_config_nofilt.yml
Using the signal of unmodified RNA from /home/carter-balaj/miniconda3/lib/python3.9/site-packages/xpore/diffmod/model_kmer.csv
0 ids to be testing ...
Any guidance is greatly appreciated thanks!!
Also attached are the log files from the preprocessing step: data.log
data.log