WGLab / PennCNV

Copy number variation detection from SNP arrays
http://penncnv.openbioinformatics.org

compile_pfb.pl running time issue #120

Open zuyan413 opened 1 month ago

zuyan413 commented 1 month ago

When I use 960 split files, compile_pfb.pl becomes extremely slow (around 30 hours), but when I reduce the number of split files to 400, it runs quickly. Command: time mpirun -np 72 --oversubscribe perl compile_pfb.pl --listfile path_to_split_file.txt -output output_batch3_2.pfb

kaichop commented 1 month ago

This could be a memory issue, where the system has to fall back on virtual memory to complete the task with so many files, or it could simply be a file system limitation in supporting simultaneous reading of 960 files. In any case, 400 files are already enough to compile a PFB.
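
For readers who want to act on the "400 files are enough" suggestion, here is a minimal sketch of drawing a random subset of 400 paths from a larger list. It assumes the list file holds one signal file path per line; the function name `sample_list_file`, the example paths, and the fixed seed are hypothetical and not part of PennCNV.

```python
import random


def sample_list_file(paths, k=400, seed=0):
    """Return k paths drawn at random without replacement.

    If fewer than k paths are available, all of them are returned.
    A fixed seed keeps the selection reproducible between runs.
    """
    if len(paths) <= k:
        return list(paths)
    rng = random.Random(seed)
    return sorted(rng.sample(paths, k))


# Hypothetical stand-in for the 960 lines of the original list file.
all_paths = [f"split/sample_{i:04d}.txt" for i in range(960)]
subset = sample_list_file(all_paths, k=400)

# The text that would be written to a new, smaller list file.
subset_text = "\n".join(subset) + "\n"
```

Sampling at random, rather than taking the first 400 lines, is a design choice to avoid favoring whichever samples happen to be listed first.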

zuyan413 commented 4 weeks ago

Thank you. So, if I have 10,000 samples and run them in batches of 960, do I have to generate a PFB file for each batch separately, or is one PFB file okay for all 10k samples? All 10k samples are from the same population.

kaichop commented 4 weeks ago

One PFB is okay for all 10k samples.
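
As a rough illustration of that workflow, the sketch below splits 10,000 sample files into batches of 960 and pairs every batch with the same PFB file. The sample names, the list file names, and `all_samples.pfb` are hypothetical placeholders; only the batch size and sample count come from the thread.

```python
def make_batches(samples, batch_size=960):
    """Split a list of samples into consecutive batches of at most batch_size."""
    return [samples[i:i + batch_size] for i in range(0, len(samples), batch_size)]


# Hypothetical sample file names standing in for the 10,000 samples.
samples = [f"sample_{i:05d}.txt" for i in range(10_000)]
batches = make_batches(samples, batch_size=960)

# One PFB file (hypothetical name) is reused for every batch.
shared_pfb = "all_samples.pfb"
plan = [
    (f"batch_{n:02d}.list", shared_pfb, len(batch))
    for n, batch in enumerate(batches, start=1)
]

for list_name, pfb, size in plan:
    print(list_name, pfb, size)
```

With these numbers the split gives ten full batches of 960 and a final batch of 400.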
