fanglab / nanodisco

nanodisco: a toolbox for discovering and exploiting multiple types of DNA methylation from individual bacteria and microbiomes using nanopore sequencing.

"wrong sign in 'by' argument" while running nanodisco motif #72

Open monxac opened 1 year ago

monxac commented 1 year ago

Hi,

I'm trying to run the nanodisco motif script, but I get this error:

[2023-05-31 13:21:36] Prepare output folder.
[2023-05-31 13:21:36] Load supplied current differences.
[2023-05-31 13:21:37] Detect motifs.
[2023-05-31 13:21:37] Processing statistical signal.
Error in do.ply(i) : task 12 failed - "wrong sign in 'by' argument"
Calls: wrapper.motif.detection ... ddply -> ldply -> llply -> ->
Execution halted

The current difference file looks normal. It has lots of NA values, but there is data for some chunks. Here are the first lines as an example:

1 | megahit_805628 | 1 | rev | t | 0 | 0 | NA | NA | NA
2 | megahit_3546264 | 1 | rev | t | 0 | 0 | NA | NA | NA
3 | megahit_3546264 | 5001 | rev | t | 0 | 0 | NA | NA | NA
4 | megahit_7415660 | 1 | rev | t | 0 | 0 | NA | NA | NA
5 | megahit_3033896 | 1 | rev | t | 0 | 0 | NA | NA | NA
6 | megahit_3033896 | 5001 | rev | t | 0 | 0 | NA | NA | NA
7 | megahit_3033896 | 12402 | fwd | t | 0 | 0 | NA | NA | NA
8 | megahit_3033896 | 12404 | fwd | t | 0 | 0 | NA | NA | NA
9 | megahit_3033896 | 12405 | fwd | t | 0 | 0 | NA | NA | NA
10 | megahit_3033896 | 12406 | fwd | t | 0 | 0 | NA | NA | NA
11 | megahit_3033896 | 12411 | fwd | t | 0 | 0 | NA | NA | NA
12 | megahit_3033896 | 12415 | fwd | t | 0 | 0 | NA | NA | NA
13 | megahit_3033896 | 12416 | fwd | t | 0 | 0 | NA | NA | NA
14 | megahit_3033896 | 12420 | fwd | t | 0 | 0 | NA | NA | NA
15 | megahit_3033896 | 12423 | fwd | t | 0 | 0 | NA | NA | NA
16 | megahit_3033896 | 12424 | fwd | t | 0 | 0 | NA | NA | NA
17 | megahit_3033896 | 12427 | fwd | t | 0 | 0 | NA | NA | NA
18 | megahit_3033896 | 12428 | fwd | t | 0 | 0 | NA | NA | NA
19 | megahit_3033896 | 12429 | fwd | t | 0 | 0 | NA | NA | NA
20 | megahit_3033896 | 12430 | fwd | t | 0 | 0 | NA | NA | NA
21 | megahit_3033896 | 12431 | fwd | t | 0 | 0 | NA | NA | NA
22 | megahit_3033896 | 12432 | fwd | t | 0 | 0 | NA | NA | NA
23 | megahit_3033896 | 12433 | fwd | t | 0 | 0 | NA | NA | NA
24 | megahit_3033896 | 12434 | fwd | t | 0 | 0 | NA | NA | NA
25 | megahit_3033896 | 12435 | fwd | t | 0 | 0 | NA | NA | NA
26 | megahit_3033896 | 12436 | fwd | t | 0 | 0 | NA | NA | NA
27 | megahit_3033896 | 12437 | fwd | t | 0 | 0 | NA | NA | NA
28 | megahit_3033896 | 12438 | fwd | t | 0 | 0 | NA | NA | NA
29 | megahit_3033896 | 12439 | fwd | t | 0 | 0 | NA | NA | NA
30 | megahit_3033896 | 12440 | fwd | t | 0 | 0 | NA | NA | NA
31 | megahit_3033896 | 12441 | fwd | t | 0 | 0 | NA | NA | NA
32 | megahit_3033896 | 12442 | fwd | t | 0 | 0 | NA | NA | NA
33 | megahit_3033896 | 12443 | fwd | t | 0 | 0 | NA | NA | NA
34 | megahit_3033896 | 12444 | fwd | t | 0 | 0 | NA | NA | NA
35 | megahit_3033896 | 12445 | fwd | t | 0 | 0 | NA | NA | NA
36 | megahit_3033896 | 12446 | fwd | t | 6 | 16 | -1.8615509 | 0.0615589 | 0.070095
37 | megahit_3033896 | 12447 | fwd | t | 8 | 9 | -7.3153943 | 0.0147895 | 0.0206499
38 | megahit_3033896 | 12448 | fwd | t | 0 | 0 | NA | NA | NA

I've tried looking at the functions the error message refers to, to understand what is causing the error. I think it's in this part of rollingFunction, called within the wrapper.motif.detection function:

genomic_position_range <- seq(min(genomic_position),max(genomic_position))

but I don't see what is causing the problem.

would someone know what the problem is?

thank you in advance,
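
For readers hitting the same message: in base R it is raised by seq() when the `by` step points away from `to`. The sketch below only reproduces the message outside nanodisco, using made-up values; it does not show which call inside rollingFunction actually fails. To my knowledge, the two-argument form quoted above reduces to `from:to` and does not raise this message on its own, so the failing call may be a different seq() call that passes `by`.

```r
# Base R only; nothing here calls nanodisco code.
# All values are illustrative, not taken from the real run.

# Two-argument form: ascending when min <= max.
positions <- c(12446, 12447, 12448)
ascending <- seq(min(positions), max(positions))

# Two-argument form with from > to simply counts down, without an error.
descending <- seq(5, 1)

# With an explicit 'by' whose sign disagrees with (to - from),
# seq() stops with "wrong sign in 'by' argument".
msg <- tryCatch(
  seq(10, 1, by = 1),
  error = function(e) conditionMessage(e)
)
print(msg)
```

If that reading is right, a place to look would be any seq(..., by = ...) call whose start ends up larger than its end for a given contig or window, which is only a guess without the full traceback.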

touala commented 11 months ago

Hello @monxac,

Sorry for the massive delay. It looks like the coverage is quite low for the couple of contigs you showed. Could you produce the summary for the complete difference dataset with summary(readRDS(<path_to_rds>))?

Thank you,

Alan
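
As a rough illustration of the kind of check requested above, the sketch below builds a tiny stand-in table from four of the posted rows and counts usable positions per contig. The column names (`contig`, `position`, `n_a`, `n_b`, `mean_diff`) are assumptions made for this example and may not match the real object returned by readRDS().

```r
# Toy stand-in for the difference table. Column names are hypothetical;
# the values are copied from rows 1, 35, 36 and 37 of the posted excerpt.
toy <- data.frame(
  contig    = c("megahit_805628", "megahit_3033896",
                "megahit_3033896", "megahit_3033896"),
  position  = c(1, 12445, 12446, 12447),
  n_a       = c(0, 0, 6, 8),
  n_b       = c(0, 0, 16, 9),
  mean_diff = c(NA, NA, -1.8615509, -7.3153943),
  stringsAsFactors = FALSE
)

# Overall summary, the same call suggested in the thread.
print(summary(toy))

# Number of positions with a non-NA difference value, per contig.
usable <- tapply(!is.na(toy$mean_diff), toy$contig, sum)
print(usable)
```

Contigs with zero or very few usable positions would be the first candidates to inspect when a per-contig step fails.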

monxac commented 11 months ago

summary.csv

Hi! Thank you for your answer,

I've attached a document with the results of summary,

thank you,

Xavier