Closed samir-watson closed 2 years ago
Hi @samir-watson ,
You made a good point.
Regarding sample replicates, we did propose a method based on Epinano_predict predictions in the publication. I haven't come up with a good strategy for epinano_differr. I think you could either pool all your K samples and all your W samples and run the analysis on the resulting super-K and super-W, or filter and combine your results from independent K-W pairs to reach a consensus.
Hope this makes sense.
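For readers following along, the second option (reaching a consensus across independent K-W pairs) could be sketched roughly as below. This is only an illustration and not part of EpiNano: it assumes you have already reduced each pair's epinano_differr output to a set of flagged sites, and the `(reference, position)` key, the function name `consensus_sites`, and the `min_pairs` cutoff are all hypothetical choices.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

# Hypothetical site key: (reference name, position). Use whatever
# uniquely identifies a site in your own per-pair results.
Site = Tuple[str, int]


def consensus_sites(per_pair_hits: Iterable[Set[Site]], min_pairs: int) -> Set[Site]:
    """Keep sites flagged in at least `min_pairs` independent K-W pairs."""
    counts: Counter = Counter()
    for hits in per_pair_hits:
        # set() guards against a site being counted twice within one pair
        counts.update(set(hits))
    return {site for site, n in counts.items() if n >= min_pairs}


# Made-up example: three K-W pairs, each with its own set of flagged sites
pairs = [
    {("chr1", 10), ("chr1", 20)},
    {("chr1", 10)},
    {("chr1", 10), ("chr1", 20), ("chr2", 5)},
]
strict = consensus_sites(pairs, min_pairs=3)   # flagged in every pair
lenient = consensus_sites(pairs, min_pairs=2)  # flagged in at least two pairs
```

Requiring a site in every pair is the most conservative choice; lowering `min_pairs` trades specificity for sensitivity.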
Hi Huanle, thanks, yes, that makes sense. I'll go ahead and combine the results from independent K-W pairs. Cheers, Samir
Hi, I have another question regarding replicates, this time using epinano-SVM. In your paper you used the pseudocode
if (s1 ≥ 0.5 and s2 ≥ 0.5 and s3 ≥ 0.5):
    M = 1
else:
    M = (s1 + s2 + s3)/3
where "M", I'm assuming, is the ProbM column generated by Epinano_Predict.
Then you use pseudocode 2 to determine modification status
Pseudocode 2:
if (Mwt/Mko) > 1.5 and Mwt > 0.5:
    status = modified
else:
    status = unmodified
For pseudocode 2, I am assuming that this is what Epinano_Predict also does as it gives a mod and unmod column, is this correct? Or is pseudocode 2 specifically used for replicates?
My question then relates to using the averaged ProbM to generate delta features. Looking at Epinano_make_delta.py, it requires information on the mis1,mis2,mis3,mis4,mis5,ins1,ins2,ins3,ins4,ins5,del1,del2,del3,del4,del5 columns, which were not accounted for in the pseudocode and thus not averaged. As I am now confused about how you average the replicates, could you please explain how you went about it?
> Hi, I have another question regarding replicates, this time using epinano-SVM. In your paper you used the pseudocode
> if (s1 ≥ 0.5 and s2 ≥ 0.5 and s3 ≥ 0.5):
>     M = 1
> else:
>     M = (s1 + s2 + s3)/3
> where "M", I'm assuming, is the ProbM column generated by Epinano_Predict.
M is the final modification score, determined from the separate ProbM values (s1, s2, s3) that Epinano_Predict computes for each replicate sample.
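As a rough illustration of pseudocode 1, the scoring rule can be written as a small Python function. This is a sketch rather than code from EpiNano: the function name `replicate_score` is hypothetical, and generalising from exactly three replicates to any number is an assumption, since the paper's pseudocode is written for s1, s2, s3.

```python
from typing import Sequence


def replicate_score(probs: Sequence[float], threshold: float = 0.5) -> float:
    """Combine per-replicate ProbM values for one site into a single score M.

    Follows pseudocode 1: if every replicate reaches the threshold, M is 1;
    otherwise M is the mean of the per-replicate values.
    """
    if not probs:
        raise ValueError("need at least one replicate probability")
    if all(p >= threshold for p in probs):
        return 1.0
    return sum(probs) / len(probs)


# Made-up ProbM values for one site across three replicates
m_agree = replicate_score([0.6, 0.7, 0.9])  # all replicates >= 0.5
m_mixed = replicate_score([0.6, 0.2, 0.1])  # falls back to the mean
```

Note that only the ProbM values are combined here; the per-replicate feature columns are not averaged by this rule.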
> Then you use pseudocode 2 to determine modification status
> Pseudocode 2:
> if (Mwt/Mko) > 1.5 and Mwt > 0.5:
>     status = modified
> else:
>     status = unmodified
> For pseudocode 2, I am assuming that this is what Epinano_Predict also does as it gives a mod and unmod column, is this correct? Or is pseudocode 2 specifically used for replicates?
Mwt and Mko are the M values determined for the WT and KO samples, respectively, using the method in pseudocode 1.
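Pseudocode 2 can likewise be sketched in Python. Again, this is an illustration and not EpiNano code: the function name `modification_status` is hypothetical, and the ratio test is rewritten as `m_wt > ratio * m_ko` so that a KO score of exactly 0 does not cause a division by zero, which is an assumption about how that edge case should be handled.

```python
def modification_status(m_wt: float, m_ko: float,
                        ratio: float = 1.5, min_wt: float = 0.5) -> str:
    """Classify one site from its WT and KO scores (the M of pseudocode 1).

    Equivalent to `(Mwt / Mko) > 1.5 and Mwt > 0.5` whenever Mko > 0.
    Assumption: when Mko is 0, the ratio is treated as exceeding the cutoff.
    """
    if m_wt > min_wt and m_wt > ratio * m_ko:
        return "modified"
    return "unmodified"


# Made-up scores for a few sites
status_a = modification_status(1.0, 0.3)  # ratio ~3.3, WT above 0.5
status_b = modification_status(0.6, 0.5)  # ratio 1.2, below the 1.5 cutoff
status_c = modification_status(0.4, 0.1)  # ratio 4, but WT not above 0.5
```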
> My question then relates to using the averaged ProbM to generate delta features. Looking at Epinano_make_delta.py, it requires information on the mis1,mis2,mis3,mis4,mis5,ins1,ins2,ins3,ins4,ins5,del1,del2,del3,del4,del5 columns which were not accounted for in the pseudocode and thus not averaged. As I am now confused about how you go about averaging out the replicates could you please explain to me how you went about it?
You can use delta features and Epinano_Predict to make predictions. Please refer to the examples included in https://github.com/novoalab/EpiNano/blob/e1a538eb0cdb36626a9060ea43bc8292d8491a3b/test_data/make_predictions/run.sh#L49 to carry out your analysis.
Please let me know if you have more questions.
Hi,
I have several replicates of WT and KO samples that I would like to compare with Epinano_DiffErr, but I am unsure at what point I should define and input the replicates. I started with mapped bam files for each replicate and ran epinano_variants separately on each file, which gave me individual .aligned.plus_strand.per.site.csv files. However, when I pass multiple csv files to Epinano_DiffErr via the -k and -w options, the script does not recognise them as replicates. I have looked in the documentation but could not find information on how I should treat replicates.
Thanks, Samir