statgen / pheweb

A tool to build a website to browse hundreds or thousands of GWAS.
MIT License

"generator raised StopIteration" in `sbatch slurm-parse.sh` #142

Closed: Shicheng-Guo closed this issue 3 years ago

Shicheng-Guo commented 3 years ago

Hi Peter,

sbatch /pheweb/generated-by-pheweb/tmp/slurm-parse-2020-07-31T07-02-25.428873.sh

This job included 77 association summary statistics files; 75 succeeded and 2 failed. Here are the errors for the 2 failed jobs. Do you have any suggestions?

(base) [sguo2@comet-ln3 tmp]$ less /projects/ps-janssen4/dsci-csb/user/sguo2/pheweb/generated-by-pheweb/tmp/parse-failures.json
{
 "categorical_41245_1820_": {
  "exception_str": "generator raised StopIteration",
  "exception_tb": "Traceback (most recent call last):\n  File \"/home/sguo2/bin/miniconda3/lib/python3.7/site-packages/pheweb/load/read_input_file.py\", line 101, in get_variants\n    colnames = [colname.strip('\"\\' ').lower() for colname in next(f).rstrip('\\n\\r').split('\\t')]\nStopIteration\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n  File \"/home/sguo2/bin/miniconda3/lib/python3.7/site-packages/pheweb/load/parse_input_files.py\", line 63, in convert\n    pheno_reader = PhenoReader(pheno, minimum_maf=conf.assoc_min_maf)\n  File \"/home/sguo2/bin/miniconda3/lib/python3.7/site-packages/pheweb/load/read_input_file.py\", line 24, in __init__\n    self.fields, self.filepaths = self._get_fields_and_filepaths(pheno['assoc_files'])\n  File \"/home/sguo2/bin/miniconda3/lib/python3.7/site-packages/pheweb/load/read_input_file.py\", line 62, in _get_fields_and_filepaths\n    v = next(ar.get_variants())\nRuntimeError: generator raised StopIteration\n",
  "succeeded": false
 },
 "categorical_41246_2510_": {
  "exception_str": "generator raised StopIteration",
  "exception_tb": "Traceback (most recent call last):\n  File \"/home/sguo2/bin/miniconda3/lib/python3.7/site-packages/pheweb/load/read_input_file.py\", line 101, in get_variants\n    colnames = [colname.strip('\"\\' ').lower() for colname in next(f).rstrip('\\n\\r').split('\\t')]\nStopIteration\n\nThe above exception was the direct cause of the following exception:\n\nTraceback (most recent call last):\n  File \"/home/sguo2/bin/miniconda3/lib/python3.7/site-packages/pheweb/load/parse_input_files.py\", line 63, in convert\n    pheno_reader = PhenoReader(pheno, minimum_maf=conf.assoc_min_maf)\n  File \"/home/sguo2/bin/miniconda3/lib/python3.7/site-packages/pheweb/load/read_input_file.py\", line 24, in __init__\n    self.fields, self.filepaths = self._get_fields_and_filepaths(pheno['assoc_files'])\n  File \"/home/sguo2/bin/miniconda3/lib/python3.7/site-packages/pheweb/load/read_input_file.py\", line 62, in _get_fields_and_filepaths\n    v = next(ar.get_variants())\nRuntimeError: generator raised StopIteration\n",
  "succeeded": false
 }
}

pjvandehaar commented 3 years ago

Somehow it was trying to read an empty file. What were the `assoc_files` in pheno-list.json for the failed phenotypes, and for a couple of successful ones for comparison? Are they on a different mountpoint that might not have been mounted on the worker machines where the jobs ran?
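One way to answer that question is to scan pheno-list.json for association files that are missing or zero bytes as seen from the machine running the check. The sketch below is illustrative only: it assumes pheno-list.json is a JSON list of objects with an `assoc_files` list (the key that appears in the traceback) and a `phenocode` label, and the helper names `find_unreadable_assoc_files` and `demo` are made up for this example. A gzipped file with no content is not zero bytes on disk, so this check would not catch that case.

```python
import json
import os
import tempfile


def find_unreadable_assoc_files(pheno_list_path):
    """Return (phenocode, path, problem) for each missing or zero-byte assoc file."""
    with open(pheno_list_path) as f:
        phenos = json.load(f)
    problems = []
    for pheno in phenos:
        # "assoc_files" is the key seen in the traceback; "phenocode" is assumed.
        for path in pheno["assoc_files"]:
            if not os.path.exists(path):
                problems.append((pheno.get("phenocode"), path, "missing"))
            elif os.path.getsize(path) == 0:
                problems.append((pheno.get("phenocode"), path, "empty"))
    return problems


def demo():
    """Build a throwaway pheno-list with one good, one empty, and one missing file."""
    with tempfile.TemporaryDirectory() as tmp:
        good = os.path.join(tmp, "good.tsv")
        empty = os.path.join(tmp, "empty.tsv")
        missing = os.path.join(tmp, "missing.tsv")  # deliberately never created
        with open(good, "w") as f:
            f.write("chrom\tpos\tref\talt\tpval\n1\t100\tA\tG\t0.5\n")
        open(empty, "w").close()
        pheno_list = os.path.join(tmp, "pheno-list.json")
        with open(pheno_list, "w") as f:
            json.dump(
                [
                    {"phenocode": "ok", "assoc_files": [good]},
                    {"phenocode": "bad_empty", "assoc_files": [empty]},
                    {"phenocode": "bad_missing", "assoc_files": [missing]},
                ],
                f,
            )
        found = find_unreadable_assoc_files(pheno_list)
        return [(code, problem) for code, _, problem in found]


print(demo())
```

Running the same function on a worker node and on the login node, and comparing the results, would show whether the files are visible from both.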

pjvandehaar commented 3 years ago

PheWeb 1.1.24 should give a better error message in this situation. Try upgrading and running it again.
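For context on why the original message is so terse: since Python 3.7 (PEP 479), a `StopIteration` that escapes from inside a generator body is converted to `RuntimeError: generator raised StopIteration`. Calling `next(f)` on a file with no lines inside a generator triggers exactly that. The sketch below reproduces the behaviour with simplified stand-in functions. It is not PheWeb's code, and `get_variants_checked` only shows one possible way to produce a clearer error, not what 1.1.24 actually does.

```python
import io


def get_variants(f):
    """Simplified stand-in for a reader that pulls the header with a bare next()."""
    header = next(f)  # StopIteration here if f has no lines at all
    colnames = header.rstrip("\n\r").split("\t")
    for line in f:
        yield dict(zip(colnames, line.rstrip("\n\r").split("\t")))


def get_variants_checked(f, name="<input>"):
    """Same reader, but an empty input produces an explicit, descriptive error."""
    header = next(f, None)  # the default value avoids StopIteration
    if header is None:
        raise ValueError(f"{name} is empty: no header line found")
    colnames = header.rstrip("\n\r").split("\t")
    for line in f:
        yield dict(zip(colnames, line.rstrip("\n\r").split("\t")))


def error_from(gen):
    """Advance a generator once and return (exception type name, message)."""
    try:
        next(gen)
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None


print(error_from(get_variants(io.StringIO(""))))
print(error_from(get_variants_checked(io.StringIO(""), "empty.tsv")))
```

The first call reports the same `RuntimeError` seen in parse-failures.json, while the second names the offending input.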

Shicheng-Guo commented 3 years ago

Hi Peter,

I checked this sbatch file and found that it is a regular sbatch job. Do you think I can set --nodes, for example as follows:

#SBATCH --nodes 2
#SBATCH --ntasks-per-node=100

Thanks.

Shicheng

pjvandehaar commented 3 years ago

After the first parse attempt fails, you should be able to just re-run `pheweb parse`; it will see that 75 files are already done and that it only needs to do 2. I haven't looked at the SLURM docs in a while to check the argument names, but you have the right approach.
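Before re-running, it can help to list exactly which phenotypes are still marked as failed. The sketch below assumes parse-failures.json keeps the shape shown earlier in this thread (phenocode mapped to an object with `exception_str` and `succeeded`). The `SAMPLE` string is a trimmed copy of that output, and `failed_phenotypes` is a hypothetical helper rather than part of PheWeb.

```python
import json

# Trimmed copy of the parse-failures.json content posted above (tracebacks omitted).
SAMPLE = """
{
 "categorical_41245_1820_": {
  "exception_str": "generator raised StopIteration",
  "succeeded": false
 },
 "categorical_41246_2510_": {
  "exception_str": "generator raised StopIteration",
  "succeeded": false
 }
}
"""


def failed_phenotypes(failures):
    """Map each phenocode whose entry is not marked succeeded to its error string."""
    return {
        code: info.get("exception_str", "")
        for code, info in failures.items()
        if not info.get("succeeded", False)
    }


failed = failed_phenotypes(json.loads(SAMPLE))
for code, reason in sorted(failed.items()):
    print(f"{code}: {reason}")
```

To use it on a real run, replace `json.loads(SAMPLE)` with `json.load(open(path))`, where `path` points at the parse-failures.json file.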