faridrashidi / scphylo-tools

a python toolkit for single-cell tumor phylogenetic analysis
https://scphylo-tools.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Recommended Resources to use HUNTRESS / multiple thread usage leads to halt of HUNTRESS ? #10

Open gordonkoehn opened 1 year ago

gordonkoehn commented 1 year ago

Hi Farid,

I am currently making heavy use of your implementation of the HUNTRESS algorithm (tl.huntress).

Do you have some guidance on the resources to allocate to HUNTRESS?

I am having some trouble running HUNTRESS. With a single thread, HUNTRESS runs fine for small cell-mutation matrices (200 cells, 10 mutations).

I tried increasing the threads to 8 and the RAM to 2 GB in view of larger mutation matrices (1000 cells, 50 mutations), yet it appears that even for the small runs (200 cells, 10 mutations) multiple threads cause trouble that I don't understand.

HUNTRESS seems to start up fine with the usual first line of output: running HUNTRESS with alpha=1e-06, beta=0.1, n_threads=8

but it then makes no progress for more than 3 hours, which is far longer than the runtime the HUNTRESS supplementary paper suggests. Surprisingly, I do not see high CPU usage from it: when multiple threads are used, no significant resources are consumed. I have tried multiple machines and setups.

Is there a way to check whether HUNTRESS is making progress, such as a debug mode or a log?
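
Independent of any HUNTRESS-specific logging, one generic way to see whether a Python process is stuck is the standard-library faulthandler module, which can print the stack of every thread at a fixed interval. The sketch below is only an illustration: long_running_job is a hypothetical stand-in for the real call, and the helper names are made up for this example. If the same stack keeps appearing in every dump, the process is most likely waiting rather than computing.

```python
import faulthandler
import sys
import time


def watch_for_stall(interval_s=60.0, file=sys.stderr):
    """Dump the stack of every thread to `file` every `interval_s` seconds."""
    # `file` must be a real file object (it needs a file descriptor).
    faulthandler.dump_traceback_later(interval_s, repeat=True, file=file)


def stop_watching():
    """Cancel the periodic stack dumps started by watch_for_stall."""
    faulthandler.cancel_dump_traceback_later()


def long_running_job(seconds):
    # Hypothetical stand-in for the real work (for example, the HUNTRESS run).
    time.sleep(seconds)
    return "done"


if __name__ == "__main__":
    watch_for_stall(interval_s=1.0)
    try:
        print(long_running_job(2.5))
    finally:
        stop_watching()
```

Note that faulthandler only reports on the process it is enabled in. If the work is handed to separate worker processes, their stacks will not show up here; an external sampler such as py-spy, attached to the worker's PID, is one option in that case.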

Thanks in advance! Best Regards,

Gordon

faridrashidi commented 1 year ago

Hi Gordon,

Would you mind using HUNTRESS from the Trisicell package for now? It has the same API. I have to fix the bug in scphylo but haven't had time to do it yet.

import trisicell as tsc
tsc.tl.huntress()

I promise to fix it this week, though, and I'll let you know.

Thank you

gordonkoehn commented 1 year ago

Great, I will give trisicell a try! So is it a known issue with multi-threading / multiprocessing?

Thanks for the prompt and super helpful support!

Gordon

faridrashidi commented 1 year ago

No, there is no issue with HUNTRESS itself. I tried to clean up the code in the scphylo implementation, but that turned out badly. I have to revert it to the original code.

Sure, anytime. I'll keep you updated.

faridrashidi commented 1 year ago

Hi Gordon,

The bug in scphylo has been fixed, and version 0.0.4 is now available for you to update to. You should be able to use HUNTRESS without any issues. Please let me know if there are still any problems. Thank you!

gordonkoehn commented 1 year ago

Great, thank you - I have updated to v0.0.4. Multiprocessing seems to work now. Apologies for my earlier wording - I of course meant the implementation of HUNTRESS, not the HUNTRESS algorithm.

Do you have any guidance on resources and runtimes? With m=15 mutations and n=250 cells it runs fine, nearly instantly, but n=300 cells already takes 15+ min (I haven't seen it finish), while runs with many more cells sometimes work. I am a little puzzled by the varying runtime. Is there a way to show whether progress is being made?

faridrashidi commented 1 year ago

Hi Gordon, sure, no worries. The running time is linear in the number of cells. Would you mind giving the original HUNTRESS code a try to see whether it also fails to finish in a reasonable time? It should be very fast on 300 cells and 15 mutations. Showing progress is a bit tricky; let me think about it.
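
A small timing harness can make the scaling question concrete. The sketch below is an assumption-laden illustration, not part of scphylo or HUNTRESS: column_sums is a hypothetical stand-in solver, and the input is a uniformly random 0/1 matrix, which is not a realistic noisy genotype matrix. The fixed seed is the useful part: it keeps the input identical between runs, so a runtime that still varies cannot be blamed on the data.

```python
import random
import time
from typing import Callable, List, Tuple

Matrix = List[List[int]]


def random_binary_matrix(n_cells: int, n_muts: int, seed: int) -> Matrix:
    """Build a cells-by-mutations 0/1 matrix; the seed makes it repeatable."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n_muts)] for _ in range(n_cells)]


def time_solver(
    solver: Callable[[Matrix], object],
    cell_counts: List[int],
    n_muts: int,
    seed: int = 0,
) -> List[Tuple[int, float]]:
    """Return (n_cells, seconds) for one solver call per cell count."""
    results = []
    for n_cells in cell_counts:
        matrix = random_binary_matrix(n_cells, n_muts, seed)
        start = time.perf_counter()
        solver(matrix)
        results.append((n_cells, time.perf_counter() - start))
    return results


def column_sums(matrix: Matrix) -> List[int]:
    # Hypothetical stand-in solver whose cost is linear in the number of cells.
    return [sum(column) for column in zip(*matrix)]


if __name__ == "__main__":
    for n_cells, seconds in time_solver(column_sums, [250, 300, 1000], n_muts=15):
        print(f"{n_cells:>5} cells: {seconds:.6f} s")
```

To time the real solver, column_sums would be replaced by a wrapper that converts the matrix into whatever input format the solver expects and then calls it.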

gordonkoehn commented 1 year ago

Yes, that's what I expect as well. OK, good idea - will try the original code then.

gordonkoehn commented 1 year ago

I debugged a little more with the updated scphylo code. The issue seems to persist: runs with a few hundred cells never finish with multiple threads, but do finish on a single core.
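
To compare the single-core and multi-threaded runs without waiting for hours, each run can be started in a fresh interpreter with a hard time limit. This is a minimal sketch using only the standard library; run_with_timeout is a made-up helper name, and the two scripts are placeholders that only sleep. In a real comparison they would run the same input with n_threads=1 and n_threads=8.

```python
import subprocess
import sys
import time
from typing import Tuple


def run_with_timeout(script: str, timeout_s: float) -> Tuple[bool, float]:
    """Run `script` in a fresh interpreter; return (finished, seconds_elapsed)."""
    start = time.perf_counter()
    try:
        # subprocess.run kills the child when the timeout expires. Worker
        # processes started by that child may survive and need manual cleanup.
        subprocess.run([sys.executable, "-c", script], timeout=timeout_s, check=True)
        finished = True
    except subprocess.TimeoutExpired:
        finished = False
    return finished, time.perf_counter() - start


if __name__ == "__main__":
    # Placeholder scripts: one that returns quickly and one that "hangs".
    quick_script = "import time; time.sleep(0.1)"
    stuck_script = "import time; time.sleep(30)"
    for label, script in [("quick", quick_script), ("stuck", stuck_script)]:
        finished, seconds = run_with_timeout(script, timeout_s=2.0)
        print(f"{label}: finished={finished} after {seconds:.1f} s")
```

A run that reports finished=False with one thread count but completes with the other gives a reproducible, time-boxed test case to attach to a bug report.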

faridrashidi commented 1 year ago

Hi Gordon, Thank you for the update. Did you also try the code in https://github.com/PASSIONLab/HUNTRESS ? Do you get the same behaviour?

gordonkoehn commented 1 year ago

Not yet. I had a look at it but haven't yet had the time to interface it properly with my code. I will keep you posted once I have tried it!