Clinical-Genomics / MIP

Mutation Identification Pipeline. Read the latest documentation:
https://clinical-genomics.gitbook.io/project-mip/
MIT License

defining TEST_REFERENCES #1997

Closed nitzankol closed 2 years ago

nitzankol commented 2 years ago

Describe the bug: I get the error message
[FATAL] 2022/07/07 06:04:07 MIP_ANALYSE - Could not find intended vcfanno_config file: TEST_REFERENCES!/grch38_loqusdb_snvindel-2020-03-24-.vcf.gz
The file exists in the reference directory, but I can't find TEST_REFERENCES in my config.yaml.

To reproduce: change the reference dir.

Expected behavior: Could you add clear documentation on where I can define TEST_REFERENCES, or on how to change it to point to the reference directory?

jemten commented 2 years ago

Hi @nitzankol, I missed this issue while I was on vacation. I need a little more background. Could you give the command you were trying to run when you got this error?

nitzankol commented 2 years ago

Hi Anders, Thank you very much for your help. I was running a test analysis with this command:

~/MIP/mip analyse rd_dna test_MIP --config grch38_mip_rd_dna_config.yaml --pedigree_file mip_pedigree.yaml --cluster_constant_path=~/workspace/ --exome_target_bed target.pdd20_hg38.bed=Father,Mother,Proband --infile_dirs ~/workspace/test_MIP=Father,Mother,Proband

I changed the reference directory in grch38_mip_rd_dna_config.yaml to point to /root/MIP/t/data/references/; otherwise it's the same as the original file. I got:

[FATAL] 2022/07/07 06:04:07 MIP_ANALYSE - Could not find intended vcfanno_config file: TEST_REFERENCES!/grch38_loqusdb_snvindel-2020-03-24-.vcf.gz

I can't find where to define TEST_REFERENCES. What am I doing wrong? Thank you very much Nitzan

jemten commented 2 years ago

OK, the files in the t folder are for running the unit tests and integration tests. You will need to download the correct reference files. But first, did you manage to install MIP using the supplied install scripts?
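To see where the TEST_REFERENCES! placeholder from the error message comes from, one option is to search for the literal string in the reference directory and in the config file. This is only a minimal sketch: it assumes the placeholder sits in a plain-text file somewhere under the paths you pass in, and the function name find_placeholder is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list every file under the given paths that still
# contains the literal placeholder string seen in the error message.
find_placeholder() {
    # -r: recurse into directories
    # -l: print only the names of matching files
    # -F: treat the pattern as a fixed string, not a regular expression
    grep -rlF 'TEST_REFERENCES!' "$@"
}

# Example call (paths taken from this thread; adjust to your setup):
# find_placeholder /root/MIP/t/data/references grch38_mip_rd_dna_config.yaml
```

Any file it lists was written for the test suite and still points at the test placeholder rather than at a real reference directory.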

If MIP is installed and you have SLURM on your HPC, you can have MIP download the references by running:

mip download -c templates/mip_download_rd_dna_config_-1.0-.yaml --container_config <path to the container config if you installed with singularity> --reference_genome_versions grch38 --reference_dir <path to where you want to download references> --environment_name <conda environment name used in installation>
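Because the command has several placeholders, it can help to assemble and print it before running anything. The sketch below does only that. The flags are copied from the comment above, on the assumption that the four dashes before reference_dir are a typo for two; the function name print_download_cmd and the example paths are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the download command without executing it.
# Arguments (all values you supply yourself):
#   $1 directory to download the references into
#   $2 name of the conda environment used during installation
#   $3 path to the container config (for Singularity installs)
print_download_cmd() {
    printf 'mip download -c %s --container_config %s --reference_genome_versions %s --reference_dir %s --environment_name %s\n' \
        'templates/mip_download_rd_dna_config_-1.0-.yaml' \
        "$3" \
        'grch38' \
        "$1" \
        "$2"
}

# Example with made-up values:
# print_download_cmd /data/mip_references mip_env /path/to/mip_container_config.yaml
```

Once the printed line looks right, it can be pasted into the shell from the MIP folder. Paths containing spaces would need quoting by hand, since the sketch prints them as-is.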

As you can probably see by now, setting up MIP is an involved process, so we are currently rewriting the pipeline in Nextflow to make it easier for others to use. You can check it out here: https://github.com/nf-core/raredisease. I can try to guide you through setting up MIP, provided you have a compatible cluster with SLURM, conda, and Singularity or Docker. Otherwise I would recommend waiting a little for the Nextflow pipeline to be released.
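A quick way to check whether a machine has these tools is to look for their executables on PATH. This is a rough sketch, not an official MIP check: it assumes that finding sbatch means SLURM is available, and the function name check_tools is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report which of the named executables are on PATH.
# Returns 0 if all were found and 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example: sbatch stands in for SLURM here (an assumption).
# check_tools perl conda sbatch singularity
```

A "missing: sbatch" line on the cluster's login node would suggest there is no SLURM scheduler to submit jobs to.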

cheers, Anders

nitzankol commented 2 years ago

Hi Anders, thanks for your help. I've installed MIP using conda and Singularity, as explained on your GitHub page. I'm not sure we have SLURM. Is it mandatory? The command you gave seems to run, but I don't know what you mean by container_config. Where can I find it? Thanks, Nitzan

jemten commented 2 years ago

Yeah, SLURM is one of the dependencies. It's the only job scheduler supported by MIP, and that is one of the reasons for rewriting it in Nextflow. I just realised that there is a bug in MIP: when you install it, you need to supply the flag --singularity_local_install. This will save the Singularity images to disk in the bin folder of the conda environment that MIP uses. The path to the container config should be <env_name>/bin/templates/mip_container_config.yaml. That file has all the paths to the downloaded containers. So provided you have perl, all the required modules and a conda environment, the command should be:

perl mip install -c templates/mip_install_config.yaml --envn <conda env> --pipelines rd_dna --reference_dir <path to where you want to have your references> --singularity_local_install

This should install the DNA pipeline correctly. Run perl mip install --help for more options.

Before doing any of that, though, you should try to run the test suite:

conda activate <env>
## Go to your MIP folder and run (will take a while)
prove t
perl t/mip_install.test
perl t/mip_download.test
perl t/mip_analyse_rd_dna.test

nitzankol commented 2 years ago

Hi Anders, the tests all run normally. However, when I ran your command I got an error:

[FATAL] 2022/08/21 06:44:05 MIP_INSTALL - 'singularity exec --bind /root/MIP/Reference/ensembl-tools-release-unknown/cache:/root/MIP/Reference/ensembl-tools-release-unknown/cache docker://docker.io/ensemblorg/ensembl-vep:release_104.3 INSTALL.pl --AUTO cfp --NO_UPDATE --PLUGINS dbNSFP,ExACpLI,LoFtool,MaxEntScan,SpliceAI --CACHEDIR /root/MIP/Reference/ensembl-tools-release-unknown/cache --SPECIES homo_sapiens_merged --ASSEMBLY GRCh37' exited with value 255

Anyway, we don't have SLURM. Reading through the documentation again, it still seems optional rather than mandatory. Maybe you could add a note to the prerequisites section stating that it is required? Should I wait for the Nextflow version, then? Thank you, Nitzan

jemten commented 2 years ago

Hi Nitzan, I agree, that should have been stated more clearly. Our goal is to have the first version of the Nextflow pipeline out in October. If you have the time, I would recommend waiting, since that should be a smoother experience than trying to get MIP to work on a non-SLURM HPC. cheers, Anders

jemten commented 2 years ago

Closing this for now, @nitzankol. Feel free to re-open the issue if needed.

nitzankol commented 2 years ago

Thanks!
