izl2 closed this issue 3 years ago
Hi @izl2, you need to run misc/Slide_Variants.py to generate the file in which variants are organized on a per-kmer basis.
Hi Huanle,
This happened to me, too; epinano_variants.py generated two files: 1) minus_strand.per.site.csv 2) plus_strand.per.site.csv
So I was wondering: to run Slide_Variants.py, should I use the plus-strand file?
Something like this: python Slide_Variants.py plus_strand.per.site.csv 5
Hi @acarmas1 ,
The help message tells how to run it:
$ python misc/Slide_Variants.py
python Slide_Variants.py per_site_var kmer_length
please provide 1) variants table from Epinano_Variants and 2) windown size(integer)
You can combine plus and minus strands data after this.
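To make "organized on kmer basis" concrete, here is a minimal sketch of the sliding-window idea. It is not EpiNano's code: the function name `slide_windows` and the `(reference, position, value)` tuple layout are assumptions made for illustration, and the real per-site table has its own columns.

```python
from itertools import groupby


def slide_windows(sites, k):
    """Yield windows of k consecutive sites per reference.

    `sites` is an iterable of (reference, position, value) tuples, sorted by
    reference and then position. This layout is hypothetical and is NOT the
    actual format written by Epinano_Variants.py.
    """
    for ref, group in groupby(sites, key=lambda s: s[0]):
        rows = list(group)
        for i in range(len(rows) - k + 1):
            window = rows[i:i + k]
            positions = [pos for _, pos, _ in window]
            # Sorted, unique positions are contiguous exactly when the
            # span between first and last equals k - 1.
            if positions[-1] - positions[0] == k - 1:
                yield ref, positions, [val for _, _, val in window]
```

Windows that span a gap in coverage are skipped, so each emitted window corresponds to one complete k-mer.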
Hi Huanle, thanks for replying
Just to make sure: should I run python Slide_Variants.py per_site_var kmer_length on both files, the plus and the minus, and then combine the results with 'cat', for example? Or did you mean something else?
Hi @acarmas1, both orders will work: cat then slide, or slide then cat.
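One thing to watch with a plain `cat` is that the header line of the second file ends up in the middle of the combined table. The sketch below shows one way to combine the two strand files while writing the header only once. It assumes that every input file starts with the same header row; the function name `combine_per_site_csvs` is made up for this example and is not part of EpiNano.

```python
import csv


def combine_per_site_csvs(paths, out_path):
    """Concatenate CSV files that share a header, writing the header once.

    Assumption (not confirmed in this thread): every input file begins
    with an identical header row. Returns the number of data rows written.
    """
    header = None
    rows_written = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, lineterminator="\n")
        for path in paths:
            with open(path, newline="") as handle:
                reader = csv.reader(handle)
                first = next(reader, None)
                if first is None:
                    continue  # empty file, nothing to copy
                if header is None:
                    header = first
                    writer.writerow(header)
                elif first != header:
                    raise ValueError(f"header mismatch in {path}")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call would look like `combine_per_site_csvs(["plus_strand.per.site.csv", "minus_strand.per.site.csv"], "both_strands.per.site.csv")`, where the output file name is only an example.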
@Huanle @acarmas1
When I ran Epinano_Variants.py, I only got the plus-strand output, e.g. wt.plus_strand.per.site.csv, and not the minus-strand output. Do you know why the minus-strand file is missing?
Also, after running Epinano_Variants.py, I ran Slide_Variants.py, but it takes a very long time. Is there a threads or processors option for this script?
Finally, if I skip Slide_Variants.py and run Epinano_Predict.py directly after Epinano_Variants.py, which model should I use, and how should I set --columns with that model?
I am looking forward to hearing from you.
Thank you!
Hi @acarmas1 ,
> When I run Epinano_Variants.py, I only got the positive strand of output, e.g. wt.plus_strand.per.site.csv, not the minus strand. Do you know why I haven't got the minus strand output?
I guess you ran it in transcriptome mode?
> Also, after running Epinano_Variants.py, I have run Slide_Variants.py, but it takes so much time. I am wondering if there is a threads or processor options in the function.
I will find time to improve the code.
> Finally, If I don't use slide_variants.py and directly run Epinano_Predict.py after Epinano_variants.py, which model should I use, and how to set --columns with that model?
If you do that, you will need to re-train the models and reformat the input.
Hope this helps. I will let you know once I finish improving the code.
Best, Huanle
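Until the script itself gains a threads option, one possible workaround is to launch several Slide_Variants.py processes side by side, one per per-site file. The sketch below only relies on the command line shown in the help message above (`python Slide_Variants.py per_site_var kmer_length`). The helper names `build_command` and `run_all`, the default script path, and the worker count are assumptions for illustration.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def build_command(per_site_csv, kmer_length, script="misc/Slide_Variants.py"):
    """Build the argument list for one Slide_Variants.py run."""
    return [sys.executable, script, per_site_csv, str(kmer_length)]


def run_all(per_site_csvs, kmer_length, workers=2):
    """Run Slide_Variants.py on several files concurrently.

    Threads are enough here because each worker just waits on a separate
    child process. Returns a dict mapping input file to exit code.
    """
    commands = [build_command(path, kmer_length) for path in per_site_csvs]

    def run(command):
        return subprocess.run(command, check=False).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        exit_codes = list(pool.map(run, commands))
    return dict(zip(per_site_csvs, exit_codes))
```

For example, `run_all(["wt.plus_strand.per.site.csv", "ko.plus_strand.per.site.csv"], 5)` would start both runs at once; whether this helps depends on available memory and cores.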
Hi, yes, I remember that I ran it in transcriptome mode.
Hi @kwonej0617
@acarmas1 Thank you for your reply! @Huanle Yes, please let me know when you have improved the code! Thank you so much. Meanwhile, I would like to split my large BAM file into multiple BAM files and then try running slide_variants.py. Is there any software you recommend, or have used, for splitting a BAM file? Thank you.
@kwonej0617 ,
You can give it a try with bamtools.
bamtools split -in file.bam -reference
would do the job for you. If you are familiar with pysam, a few lines of Python code should also do the same task.
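As a rough illustration of the pysam route, the sketch below writes one BAM file per reference sequence in a single pass over the input. It is an untested outline under stated assumptions: pysam must be installed, the function names and the `.REF_<name>` output naming are hypothetical, and reference names are assumed to be safe to use in file names.

```python
import os


def split_output_path(bam_path, reference):
    """Return the output path for one reference (hypothetical naming)."""
    stem, ext = os.path.splitext(bam_path)
    return f"{stem}.REF_{reference}{ext}"


def split_bam_by_reference(bam_path):
    """Write one BAM per reference; return the reference names written.

    Reads are streamed with until_eof=True, so no index is required.
    One output file stays open per reference, which may hit the open-file
    limit when the reference set is large (e.g. a transcriptome).
    """
    import pysam  # third-party; imported lazily so the helper above works without it

    writers = {}
    with pysam.AlignmentFile(bam_path, "rb") as bam:
        try:
            for read in bam.fetch(until_eof=True):
                if read.is_unmapped:
                    continue  # unmapped reads have no reference to group by
                ref = read.reference_name
                if ref not in writers:
                    # template=bam copies the header into each output file
                    writers[ref] = pysam.AlignmentFile(
                        split_output_path(bam_path, ref), "wb", template=bam
                    )
                writers[ref].write(read)
        finally:
            for writer in writers.values():
                writer.close()
    return sorted(writers)
```

The split files would still need to go through Epinano_Variants.py before Slide_Variants.py can be run on the resulting per-site tables.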
Thank you so much for your advice! I am also looking forward to hearing from you about the improvements to Slide_Variants.py. I would really appreciate it if you could let me know. Thank you so much.
Hi!
I recently attempted to run Epinano_Variants.py, using the ko.bam, wt.bam, and ref.fa files provided in test_data. I was able to generate the ko.plus_strand.per.site.csv and wt.plus_strand.per.site.csv output in the same directory as the .bam and .fa files. However, I did not see any per_site.5mer.csv files, which I believe I need to use as part of EpiNano-SVM. Wondering if you had any idea where I might find these files?
Thanks so much!