lydiayliu closed this issue 2 years ago
Honestly, I feel the same way about fusions, for no apparent reason. CPU usage just hovers around 200% (I gave it 16 threads). I don't think the runs are slower than before, but they are just as slow...
before
/hot/users/yiyangliu/MoPepGen/Variant/Fusion/fusioncatcher-1.33/variant.ensembl.winu.3f.log
now (still running the last few)
/hot/users/yiyangliu/MoPepGen/Variant/Fusion/fusioncatcher-1.33/variant.ensembl.s.nc.log
Actually, never mind: it is slow but still an improvement over before! The 200% CPU is probably because a few fusions are much more complex than the others, and all the other threads end up waiting on them.
I also ran the circRNA case; it seems to have gotten stuck at ENST00000484888.5. I agree that the traceback isn't as user-friendly as before, but there is only so much I can do.
And for parallelization, since the order doesn't matter anymore, maybe we can sort the transcripts by complexity so the ones that take a long time all get processed first instead of straggling at the end.
I'm not sure parallelization is working for circRNAs... I've been running a sample for the better part of today and it hasn't budged. I'm doing this on a single node with a single process, and I tried both 16 and 32 threads.
Killing the process gives this; also, CPU usage inside the Docker container just hovers around 100%.
The entire log just looks like this (this sample used to run in under 20 minutes before GVF indexing and multi-processing); it's been quite a few hours now.
However, I did get CPCG0100 to run through, and it only took a little longer than before, so idk...
CPCG0100 old log prior to GVF indexing and multi-process:
I'm gonna try 12 threads and verbose level 2 now.