Open IamksGEEK opened 1 year ago
Yes, I fully agree with you. I avoided the 'foreach' function and the packages for parallel processing. Instead, I split the R code into several sections and ran them in parallel by hand. That way the demand for threads stayed low and execution was faster. The approach paid off: I had previously failed to run the original code on an HPC node with fewer cores and less memory than yours.
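One way to picture the manual split is to divide the list of work items (for example, sample IDs) into a few batches and launch one ordinary, single-threaded job per batch. The sketch below is in Python rather than R and is only an illustration: the function name `split_into_batches`, the sample names, and the choice of splitting by sample are all assumptions, not the code actually used.

```python
def split_into_batches(items, n_batches):
    """Split items into n_batches contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        # The first `extra` batches take one additional item each.
        end = start + base + (1 if i < extra else 0)
        batches.append(items[start:end])
        start = end
    return batches


# Hypothetical sample IDs; replace with the real list.
samples = [f"sample_{i:03d}" for i in range(1, 11)]
batches = split_into_batches(samples, 4)

for idx, batch in enumerate(batches, start=1):
    # Each batch would be handed to its own separately launched job.
    print(f"job {idx}: {', '.join(batch)}")
```

Each printed line corresponds to one job that can be started by hand, so the number of jobs running at once, and therefore the thread demand, stays under direct control.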
Hi, sorry to bother you, but the computation time of step i05 is huge: counting variants could take several months for a larger sample size such as 400. In fact, it took me more than 5 hours to count variants for one sample on an HPC node with 64 cores and 1 TB of memory, and the job used every thread on the node and half of the memory. Is there any way to reduce the computation time further? For example, could I divide the gnomAD database into 2.5 Mb windows? Please give me some suggestions for making the algorithm faster, or tell me if I am doing something wrong. By the way, we found the concept of regional difference very useful: a cervical carcinoma prediction model built on regional difference reached an accuracy of 95%.
kongshuang
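To make the 2.5 Mb window idea concrete, here is a minimal Python sketch that only generates the window coordinates for one chromosome. The function name `make_windows`, the chromosome label, and the 6 Mb length are placeholders chosen for illustration; whether step i05 can actually consume gnomAD data window by window is an open question, not something this sketch shows.

```python
WINDOW_SIZE = 2_500_000  # 2.5 Mb, as proposed above


def make_windows(chrom, length, size=WINDOW_SIZE):
    """Return half-open (chrom, start, end) windows covering [0, length)."""
    return [
        (chrom, start, min(start + size, length))
        for start in range(0, length, size)
    ]


# Hypothetical chromosome name and length, for illustration only.
windows = make_windows("chrN", 6_000_000)

for chrom, start, end in windows:
    print(f"{chrom}:{start}-{end}")
```

The final window is clipped to the chromosome length, so the windows cover every position exactly once and each one could in principle be processed as an independent job.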