Nanocompore stops after 50% of progress

vahidAK commented 4 years ago

Hi @tleonardi

I am using Nanocompore to find modified bases between two samples. It runs quickly and without problems, but when it reaches 51% it gets stuck there and shows no further progress and no error (see the attached screenshot).

I would appreciate it if you could help me with this issue.

Thanks, Vahid.

tleonardi commented 4 years ago

Hi @vahidAK, sorry to hear about the problem. From your screenshot it looks like Nanocompore exited: did you terminate it with Ctrl+C, or did it exit on its own before reaching 100%? If you terminated it manually, how long did you wait at 51% before hitting Ctrl+C?

Also, I noticed you are using --min_coverage 3. Three reads are definitely not enough to do the statistical testing, and it's possible that with such low coverage one of the worker threads throws an exception which is not handled properly. Can you try running your command again with --min_coverage 30?

Lastly, you are using 84 threads while having only 51 transcripts. Since transcripts are processed in parallel over multiple threads, having more threads than transcripts is not going to help. It's a situation that I haven't tested thoroughly, but there's a chance that it's causing a deadlock. Could you try running Nanocompore again with a lower number of threads, e.g. -t 20?
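To illustrate the point about thread count: when work is split per transcript, any worker beyond the number of transcripts has nothing to do. The sketch below shows one way a caller could cap the requested thread count before launching a run. `effective_workers` is a hypothetical helper written for this example; it is not part of Nanocompore.

```python
def effective_workers(requested_threads: int, n_transcripts: int) -> int:
    """Cap the worker count at the number of transcripts.

    Hypothetical helper for illustration only, not a Nanocompore function.
    """
    if requested_threads < 1:
        raise ValueError("requested_threads must be >= 1")
    # Never return fewer than one worker, never more than there are transcripts.
    return max(1, min(requested_threads, n_transcripts))


# Numbers from this thread: 84 threads requested, 51 transcripts available.
print(effective_workers(84, 51))
```

With the numbers reported above, this caps the run at 51 workers; the suggested -t 20 is already below that limit and would be left unchanged.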

vahidAK commented 4 years ago

Hi @tleonardi, no, I did not terminate the job myself. I will try your recommendations and let you know.

Thanks, Vahid.

vahidAK commented 4 years ago

Hi @tleonardi ,

I ran it again. But it still has the same problem.

tleonardi commented 4 years ago

Hi @vahidAK, could you post another screenshot, the content of the log file and the list of files (with their sizes) contained in the results folder? thanks!

vahidAK commented 4 years ago

Hi @tleonardi ,

I have attached the screenshots

Screenshot from 2020-02-06 15-34-19

The "terminated" message at the end is there because the run just stopped at that point and I terminated it manually.

Screenshot from 2020-02-06 15-34-58

Because one of my datasets does not have good coverage, I used a lower coverage threshold than you suggested. I think the problem might be due to the large file size and a RAM issue. Would it help to split the data by chromosome? Thanks

tleonardi commented 4 years ago

Hi @vahidAK, thanks for the info. RAM size is usually not an issue, because those big files are never loaded in memory in one go. Rather, we use the index (.idx files) for random access to the event_align_collapse files. The results folder that I meant should be COLO829vsColoBL_Nanocompore_results: can you send me the ls -la of that one as well, please? Also, how long did you wait before terminating? Last suggestion: can you try running Nanocompore again using the option --downsample_high_coverage 5000?
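For readers unfamiliar with index-based random access, here is a minimal sketch of the general idea: an index maps each record key to a byte offset, so a reader can seek straight to one record without loading the whole file. The tab-separated layout, the `build_index` and `read_record` helpers, and the sample data are all assumptions made for this example; they do not reproduce the actual .idx or event_align_collapse formats.

```python
import io


def build_index(handle):
    """Map each key (first tab-separated field) to the byte offset of its line."""
    index = {}
    handle.seek(0)
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        key = line.split(b"\t", 1)[0].decode()
        # Keep the offset of the first line seen for each key.
        index.setdefault(key, offset)
    return index


def read_record(handle, index, key):
    """Seek directly to one record instead of scanning the whole file."""
    handle.seek(index[key])
    return handle.readline().decode().rstrip("\n")


# Toy in-memory "file"; a real run would open a large file in binary mode.
data = io.BytesIO(b"tx1\t0.91\ntx2\t1.07\ntx3\t0.88\n")
idx = build_index(data)
print(read_record(data, idx, "tx2"))
```

Only the index and the requested record are held in memory at any time, which is why total file size alone would not be expected to exhaust RAM under this kind of scheme.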

tleonardi commented 4 years ago

Hi @vahidAK, we have just improved logging and solved a couple of bugs. Could you try running Nanocompore again using the version in the branch fix_issue_120? thanks!

matthew-valentine commented 3 years ago

Hi @tleonardi,

Sorry to jump on this but I am having exactly the same issue. Running sampcomp (with mainly default options) I get to 13% completion in about 1.5 hours, but then it just stops progressing, regardless of how long I leave it (longest I let it run was for around two weeks while I was on a break but it was still at 13% when I got back). I've tried running it with the downsampling option as indicated above but the same occurs and it always stops at exactly the same point (118/940 references).

I attached the screenshots you asked Vahid for in case that helps (ls -la in the results folder and the contents of the log file).

Thanks, Matthew

Screenshot 2020-09-28 at 15 24 15

Screenshot 2020-09-28 at 15 24 42

tleonardi commented 3 years ago

Hi @matthew-valentine, sorry about this problem. I have made some major improvements to the logging in Nanocompore. Could you run Nanocompore again using the code in the fix_issue_120 branch and then send me the log file?

cheers tom

matthew-valentine commented 3 years ago

out_SampComp.log

Thanks so much for getting back to me @tleonardi,

I did as you suggested and ran Nanocompore using the fix_issue_120 branch and I attached the log file as requested. It seems to stop at exactly the same point, but the log file does allow me to see what it is attempting to do at that point. There is seemingly something going wrong while it is trying to process ENST00000282388::2:43222401-43226609(-). I had a look at the input files to see what they looked like for this reference vs other successfully processed references and there wasn't anything glaringly obvious. Any tips on what I should be looking out for?

Thanks, Matthew

tleonardi commented 3 years ago

Hi Matthew, great, this is really helpful. Can you try rerunning Nanocompore using the --allow_warnings flag? If you still get the error could you then run it once again with the --log_level debug option?

thanks tom

matthew-valentine commented 3 years ago

Hi again @tleonardi,

Running it with --allow_warnings did the trick, and it completed successfully. I was just wondering if you could say more about the difference between this version of nanocompore and the one in the master branch? The initial reference coverage filtering is vastly different between the two, as with the master version I had 940 references after filtering, but with this version I only get 328. I can see from the log file that all the default values (e.g. for min_coverage) are the same, so I am guessing something has changed under the hood?

Cheers, Matthew

tleonardi commented 3 years ago

Hi @matthew-valentine, great, glad to hear it worked. The version in master has a bug in the routines that do the whitelisting. Essentially, if a transcript is present in one condition but not in the other, the version in master retains it for further analysis anyway. The effect is that when doing the comparison between conditions you get an error due to the fact that one condition doesn't have any data at all. You can have a look at issue #138 for further info. So in reality I think you only have 328 transcripts with enough data in all samples, while 612 (940-328) are only present in some samples but not in all of them.
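The intended whitelisting behaviour described above can be sketched as a set intersection: a reference is kept only if it passes the coverage threshold in every condition. This is an illustrative sketch, not the code in either branch; the `whitelist` function, the dictionary layout, and the toy read counts are assumptions, and the threshold of 30 is simply the value suggested earlier in this thread.

```python
def whitelist(coverage_by_condition, min_coverage=30):
    """Keep only references that reach min_coverage in every condition.

    Hypothetical helper for illustration only, not Nanocompore's implementation.
    """
    passing = [
        {ref for ref, n_reads in cov.items() if n_reads >= min_coverage}
        for cov in coverage_by_condition.values()
    ]
    # A reference missing from any one condition drops out of the intersection.
    return set.intersection(*passing) if passing else set()


coverage = {
    "condition_A": {"tx1": 120, "tx2": 45, "tx3": 80},
    "condition_B": {"tx1": 95, "tx3": 12},
}
print(sorted(whitelist(coverage)))
```

Here tx2 is dropped because it is absent from condition_B, and tx3 is dropped because it falls below the threshold there. Taking a union instead of an intersection would retain both, which is the kind of over-inclusion described for the master branch.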

In any case, I think the problem that you had was unrelated to this. Rather, my guess is that you had some transcripts where the variance between groups was 0. When this happens the stats functions that we use throw a warning to signal that there's something wrong. By default, Nanocompore catches this warning and throws an error. The --allow_warnings option instead makes Nanocompore "accept" the warning and move on. If you look in your log file you should find these warning messages.
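The warning handling described above can be sketched with Python's standard `warnings` module. By default the sketch escalates any warning to an exception; with `allow_warnings=True` the warning is reported and the computation continues. `variance_ratio` is a hypothetical stand-in for a stats routine, and `run_test` is not Nanocompore's actual code; only the general mechanism is shown.

```python
import warnings


def variance_ratio(group_a, group_b):
    """Hypothetical stand-in for a stats routine that warns on zero variance."""
    def var(xs):
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / len(xs)

    va, vb = var(group_a), var(group_b)
    if va == 0 or vb == 0:
        warnings.warn("zero variance in at least one group", RuntimeWarning)
        return float("nan")
    return va / vb


def run_test(group_a, group_b, allow_warnings=False):
    """Escalate warnings to errors unless allow_warnings is set."""
    with warnings.catch_warnings():
        # "error" turns every warning into a raised exception;
        # "always" reports the warning and lets execution continue.
        warnings.simplefilter("always" if allow_warnings else "error")
        return variance_ratio(group_a, group_b)


try:
    run_test([1, 1, 1], [1, 2, 3])
except RuntimeWarning as exc:
    print(f"stopped: {exc}")

print(run_test([1, 1, 1], [1, 2, 3], allow_warnings=True))
```

In the strict mode a single degenerate transcript is enough to stop the run, whereas the permissive mode yields a NaN for that transcript and moves on.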

tleonardi / nanocompore

Nanocompore stops after 50% of progress #120