resources config files added for F16, F32, F72, and M64

zhuchcn commented 2 years ago

@lydiayliu Is 10 GB enough for parsers?

[X] I have read the code review guidelines and the code review best practice on GitHub check-list.
[X] The name of the branch is meaningful and well formatted following the standards, using [AD_username (or 5 letters of AD if AD is too long)]-[brief_description_of_branch].
[X] I have set up or verified the branch protection rule following the github standards before opening this pull request.
[X] I have added my name to the contributors listings in the metadata.yaml and the manifest block in the nextflow.config as part of this pull request, am listed already, or do not wish to be listed. (This acknowledgement is optional.)
[ ] I have added the changes included in this pull request to the CHANGELOG.md under the next release version or unreleased, and updated the date.
[ ] I have updated the version number in the metadata.yaml and manifest block of the nextflow.config file following semver, or the version number has already been updated. (Leave it unchecked if you are unsure about new version number and discuss it with the infrastructure team in this PR.)
[X] All test cases have passed.

Closes #73

lydiayliu commented 2 years ago

> Is 10 GB enough for parsers?

I don't think so. Everything that needs to load the index should get at least 15G

I gathered the logs from CCLE:

/hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30/pipeline-meta-call-NonCanonicalPeptide-0.0.1/call_parsers.trace.txt
/hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30/pipeline-meta-call-NonCanonicalPeptide-0.0.1/call_variant.trace.txt

It's on the conservative side, but everything was around 11.1G. I agree with giving callVariant 30G for the worst-case scenario, though. I don't know if I want to give callVariant more than that, cuz needing more would signal an issue.

lydiayliu commented 2 years ago

Why don't we take the opportunity to add the values for the rest of pipeline steps?

split_fasta: /hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30_merge_split/pipeline-meta-call-NonCanonicalPeptide-0.0.1/split_fasta.trace.txt

14G

merge_fasta: /hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30_merge_split/pipeline-meta-call-NonCanonicalPeptide-0.0.1/merge_fasta.trace.txt

8G

filter_fasta: /hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30_filter_split_0001_DONT_USE/pipeline-meta-call-NonCanonicalPeptide-0.0.1/filter_fasta.trace.txt

6G

decoy_fasta: /hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30_merge_split/pipeline-meta-call-NonCanonicalPeptide-0.0.1/decoy_fasta.trace.txt

8G

encode_fasta: /hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30_merge_split/pipeline-meta-call-NonCanonicalPeptide-0.0.1/encode_fasta.trace.txt

4G
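For illustration only, the per-step values listed above could be turned into a Nextflow `process { withName: ... }` block along these lines. This is a minimal sketch: it assumes the step labels used in this thread (`split_fasta`, `merge_fasta`, and so on) match the process names in the pipeline, which the thread does not confirm, and `render_process_block` is a hypothetical helper, not something in this repository.

```python
# Memory values (in GB) as recommended in the comment above.
# ASSUMPTION: the keys are used as Nextflow process names; the real
# process names in the pipeline may differ.
STEP_MEMORY_GB = {
    "split_fasta": 14,
    "merge_fasta": 8,
    "filter_fasta": 6,
    "decoy_fasta": 8,
    "encode_fasta": 4,
}


def render_process_block(memory_gb):
    """Render a Nextflow config `process` scope with one withName selector per step."""
    lines = ["process {"]
    for name, gb in memory_gb.items():
        lines.append(f"    withName: {name} {{")
        lines.append(f"        memory = {gb}.GB")
        lines.append("    }")
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_process_block(STEP_MEMORY_GB))
```

The same dictionary could be reused to render one file per node type (F16, F32, F72, M64) if the values end up differing between them.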

zhuchcn commented 2 years ago

That's great! I really appreciate you gathering all this information. It's a little surprising to see how much memory is used by processes that I thought were low-memory. Are the recommended memory values you gave the worst case? I'm a little confused because in this file (/hot/project/algorithm/moPepGen/CCLE/processed/noncanonical-database/call-nonCanonicalPeptide/GRCh38-EBI-GENCODE34/2022-05-30_merge_split/pipeline-meta-call-NonCanonicalPeptide-0.0.1/split_fasta.trace.txt), the worst case I can see is 9.6 GB for rss and 10.7 GB for vmem. Did you just add some extra headroom to be safe?
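A quick way to double-check the worst case in a trace file is to scan the memory columns and take the maximum. The sketch below assumes the trace is tab-separated with a header row, that the columns are named `peak_rss` and `peak_vmem`, and that values are human-readable strings such as `9.6 GB`; adjust the column names if the trace was written with different fields. The sample rows are made up for the demo, apart from the 9.6 GB and 10.7 GB figures quoted above.

```python
import csv
import io
import re

# Binary multipliers for the unit suffixes expected in the trace.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_mem(text):
    """Convert a string such as '9.6 GB' to bytes; return None for blanks or '-'."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*", text or "", re.IGNORECASE)
    if match is None:
        return None
    return float(match.group(1)) * _UNITS[match.group(2).upper()]


def worst_case_gb(trace_text, columns=("peak_rss", "peak_vmem")):
    """Return the largest value (in GB) seen in each requested column."""
    reader = csv.DictReader(io.StringIO(trace_text), delimiter="\t")
    worst = {}
    for row in reader:
        for col in columns:
            value = parse_mem(row.get(col))
            if value is not None:
                worst[col] = max(worst.get(col, 0.0), value)
    return {col: round(v / 1024**3, 2) for col, v in worst.items()}


if __name__ == "__main__":
    # Hypothetical trace content for demonstration only.
    sample = (
        "task_id\tname\tpeak_rss\tpeak_vmem\n"
        "1\tsplit_fasta (1)\t9.6 GB\t10.7 GB\n"
        "2\tsplit_fasta (2)\t512 MB\t1 GB\n"
        "3\tsplit_fasta (3)\t-\t-\n"
    )
    print(worst_case_gb(sample))
```

For a real file, read it with `open(path).read()` and pass the text to `worst_case_gb`.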

lydiayliu commented 2 years ago

Feel free to adjust! I just added some extra numbers and rounded to an even number lol, no real reason. But CCLE is definitely a conservative estimate considering there are very few mutations. I just don't have the numbers on CPCG
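If it helps to make the "add some extra and round to an even number" step repeatable, it could be written roughly as below. This is only a sketch: the 25% headroom factor is an assumption chosen for illustration, not the rule that was actually used for the numbers in this thread, and `recommend_gb` is a hypothetical helper.

```python
import math


def recommend_gb(peak_gb, headroom=1.25):
    """Pad an observed peak (in GB) and round up to the next even whole number of GB.

    ASSUMPTION: `headroom` is an illustrative multiplier, not a value
    taken from the pipeline or from this thread.
    """
    padded = math.ceil(peak_gb * headroom)
    # Bump odd values up by one so the result is always even.
    return padded + (padded % 2)


if __name__ == "__main__":
    # 10.7 GB is the worst vmem value quoted for split_fasta above.
    print(recommend_gb(10.7))
```

With this particular headroom, the 10.7 GB vmem peak comes out at 14, which happens to match the split_fasta value above; a dataset with more mutations, such as CPCG, may call for a larger factor.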

zhuchcn commented 2 years ago

Just modified the config files according to your recommendation.

uclahs-cds / pipeline-call-NonCanonicalPeptide

resources config files added for F16, F32, F72, and M64 #74