randel / EnsDeconv

Ensemble Deconvolution to robustly estimate cellular fractions from bulk omics data

CPU Use with some method combinations always 100% of all available threads, regardless of ncore setting #2

Closed edammer closed 2 years ago

edammer commented 2 years ago

[Image: CPU usage screenshot]

At and after method combination 62 of 150, using the 3 scRNA brain data sets (5 cell types) provided as testdata, CPU use goes to 100% on all available threads, regardless of the number of threads set with the ncore= parameter (8 was used here, and it was adhered to until this point). This behavior has been confirmed on Linux RStudio Server v2022.02.3 build 492 under CentOS 7 (shown above), and in the R 4.0.2 console under Windows 10 (32 threads total with ncore=4 specified), using more than one proteomic bulk count data set.

At method 135/150, CPU use returns to 8 threads: [Image: CPU usage screenshot]
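One possible explanation, which is an assumption and not something confirmed in this thread, is that some of the methods call multithreaded native libraries (BLAS or OpenMP), whose thread pools are not controlled by ncore. A minimal sketch for testing that idea is below; `cap_native_threads` is a hypothetical helper, not part of EnsDeconv, and it only sets the standard thread-count environment variables using base R.

```r
# Hypothetical helper (NOT part of EnsDeconv): cap the threads used by
# multithreaded native libraries through their standard environment variables.
cap_native_threads <- function(n = 1L) {
  vars <- c("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")
  old <- Sys.getenv(vars, unset = NA)        # remember the previous values
  new <- as.list(rep(as.character(n), length(vars)))
  names(new) <- vars
  do.call(Sys.setenv, new)                   # Sys.setenv(OMP_NUM_THREADS = "1", ...)
  invisible(old)
}

# Call at the very start of a fresh session, before running the ensemble.
cap_native_threads(1L)
Sys.getenv(c("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"))
```

These variables are usually read when the library initializes, so setting them in the shell or in ~/.Renviron before R starts is likely more reliable than setting them mid-session. If CPU use then stays at the ncore value through combinations 62 to 135, nested threading would be a plausible cause; if not, the cause lies elsewhere.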

manqicai commented 2 years ago

@edammer Hi Eric, thank you for the updates. I will get back to you this weekend.

edammer commented 2 years ago

@manqicai I am more concerned about the other open issue, missing value handling, as this appears to affect how many combinations complete successfully and contribute to an ensemble. A possibly separate issue is that marker selection seems to select some genes that do not make logical sense. I base this on pulling the markers used out of the ensemble output list and checking them in the Wyss-Coray data visualization Shiny app. There could be a number of underlying causes or explanations for this. The snRNA data for the 14 cell types is accessible as an online GUI: https://twc-stanford.shinyapps.io/human_bbb/

Regardless, I appreciate you looking into the issue and will continue trying each of the methods you implemented in isolation and as an ensemble in EnsDeconv on human proteomic bulk brain data.

manqicai commented 2 years ago

@edammer Hi Eric, sorry for the wait! I tried the code from your screenshot, but I don't have a clear answer right now. Would you mind telling me which scenario number 62 is in your case? Also, if you think it would be helpful, I would be more than happy to schedule a Zoom meeting with you to try to solve your issues. As for the markers, it might be due to the number of markers you use. We are currently looking into cell-type-specific markers more closely to try to refine this step in EnsDeconv.