ntm / grexome-TIMC-Secondary

exome pipeline from TIMC - secondary analyses (GVCF to analysis-ready TSVs)
GNU General Public License v3.0

Final result file folders Empty ----E 8_extractTranscripts.pl: couldn't find one of HV/HET/OCHV/OCHET for OM #5

Closed Chris-lang478 closed 2 years ago

Chris-lang478 commented 2 years ago

@ntm I tried the VEP command from 3_runVEP.pl on a VCF file chopped from the whole one, as the script does, and it produced result files. So I am really confused about the empty final result. How can I find the step that went wrong? Thanks again for your help!

test.zip vepStats.zip

Here is the log file. So is the problem caused by 8_extractSamples.pl? I just used the example sample.xls and replaced the sample names, keeping the original column names. sampleCLN.zip

During the process, the tmpdir is not empty.

I 2022-10-10 05:12:13: 8_extractSamples.pl - starting to run
Use of uninitialized value $header in scalar chomp at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractSamples.pl line 161.
Use of uninitialized value $header in split at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractSamples.pl line 162.
E: 8_extractSamples.pl - couldn't find OMF_HV or OMF_OTHERCAUSE_HV in header of infile OMF.csv
I 2022-10-10 05:12:14: 8_extractSamples.pl - ALL DONE, completed successfully!
I 2022-10-10 05:12:15: 8_extractTranscripts.pl - starting to run
Use of uninitialized value $header in scalar chomp at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractTranscripts.pl line 213.
Use of uninitialized value $header in split at /home/data/zlsz_01/CLN_dir/grexomePIP/grexome-TIMC-Secondary/8_extractTranscripts.pl line 214.
E 8_extractTranscripts.pl: couldn't find one of HV/HET/OCHV/OCHET for OM

Thanks again for your help! Best wishes, Chris

ntm commented 2 years ago

The "E: 8_extractSamples.pl" error occurs because that script received an empty file as input. I fixed the code to produce a clear error message instead of the two obscure "Use of uninitialized value $header..." messages, but this is just cosmetic. Logging "8_extractSamples.pl - ALL DONE, completed successfully!" instead of aborting with an error message when the step actually failed was also a mistake; it has now been fixed. The same goes for 8_extractTranscripts.pl, which I fixed as well. But your real issue is upstream of these steps: they should not be receiving empty files as input, so some earlier step must have failed. Your sampleCLN.xlsx looks fine, and if it weren't you would get early error messages about it: all metadata files get thoroughly sanity-checked before anything else happens. Please do the following:

  1. Run git pull to refresh your codebase. You should not change any files from the repo; the only customizations should be in a copy of grexomeTIMCsec_config.pm (copied somewhere and edited). For example, if you don't use subCohorts, just modify &subCohorts() in your copy of grexomeTIMCsec_config.pm so it returns an empty hash.
  2. Run grexome-TIMC-secondary.pl
  3. If the logfile still contains E or W lines, and/or if the results are still empty, please post the full command-line that you used and attach the full logfile.
  4. Also please post the result of zgrep -m 1 '#CHROM' [your GVCF.gz file]. It should end with the sample IDs, including at least all the sampleID values in sampleCLN.xlsx. I thought the code checked this, but your test VEP output from test.zip doesn't have anything after INFO, which seems strange.
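
The checks in steps 3 and 4 can be scripted. The following is only a rough sketch and is not part of the pipeline: the function names are made up for illustration, the E/W pattern is guessed from the log excerpt quoted above, and the list of expected sample IDs is assumed to be a plain text file with one ID per line (for example, copied by hand from the sampleID column of sampleCLN.xlsx).

```shell
# Hypothetical helpers, not part of grexome-TIMC-Secondary.

# Print (with line numbers) the E and W lines of a pipeline logfile.
# Assumes such lines start with "E" or "W" followed by ":" or a space,
# as in the excerpt above.
log_problems() {
    grep -nE '^[EW][: ]' "$1"
}

# Print the sample IDs found in the #CHROM header line of a gzipped (G)VCF,
# one per line. In a VCF header the first 9 columns are fixed
# (CHROM through FORMAT), so sample IDs start at column 10.
gvcf_samples() {
    gzip -cd "$1" |
        awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }'
}

# Print every ID listed in file $1 that is absent from the header of GVCF $2.
# "pass" is set to 1 while awk reads the header samples from stdin ("-")
# and to 2 while it reads the file of expected IDs.
missing_samples() {
    gvcf_samples "$2" |
        awk 'pass == 1 { seen[$0]; next } !($0 in seen)' pass=1 - pass=2 "$1"
}
```

For instance, `missing_samples ids.txt cohort.g.vcf.gz` (both file names are placeholders) prints nothing when every expected sample appears in the header, and prints every expected ID when the header stops at INFO.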

Chris-lang478 commented 2 years ago

@ntm Thanks for your great work!! It runs successfully after refreshing the repo with git pull.

Cheers, Chris