PMBio / deeprvat

Variants file not passed to association_dataset in seedgenediscovery pipeline #69

Closed Jonas-B-Frank closed 4 months ago

Jonas-B-Frank commented 5 months ago

I am running the seed_gene_discovery pipeline on data that was processed beforehand by the preprocessing and annotations pipelines. I created phenotypes.parquet with the samples in the same order as in the genotypes.h5 file.

Running the pipeline, I get an error in rule association_dataset:

    Traceback (most recent call last):
      File "PATH/miniconda3/envs/deeprvat/bin/seed_gene_pipeline", line 33, in <module>
        sys.exit(load_entry_point('deeprvat', 'console_scripts', 'seed_gene_pipeline')())
      File "PATH/miniconda3/envs/deeprvat/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
        return self.main(*args, **kwargs)
      File "PATH/miniconda3/envs/deeprvat/lib/python3.8/site-packages/click/core.py", line 1078, in main
        rv = self.invoke(ctx)
      File "PATH/miniconda3/envs/deeprvat/lib/python3.8/site-packages/click/core.py", line 1688, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "PATH/miniconda3/envs/deeprvat/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "PATH/miniconda3/envs/deeprvat/lib/python3.8/site-packages/click/core.py", line 783, in invoke
        return __callback(*args, **kwargs)
      File "PATH/deeprvat/seed_gene_discovery/seed_gene_discovery.py", line 591, in make_dataset
        _, ds = make_dataset_(
      File "PATH/deeprvat/seed_gene_discovery/seed_gene_discovery.py", line 543, in make_dataset_
        dataset = DenseGTDataset(
      File "PATH/deeprvat/data/dense_gt.py", line 141, in __init__
        raise ValueError("variant_file must be specified")
    ValueError: variant_file must be specified

I think the error is related to this call in make_dataset_:

    dataset = DenseGTDataset(
        gt_file=data_config["gt_file"],
        skip_y_na=True,
        skip_x_na=True,
        **data_config["dataset_config"],
    )

Only the entries of data_config["dataset_config"] are unpacked into the call, so variant_file, although it is specified in the example config file, is no longer passed to the constructor.

Adding variant_file=data_config["variant_file"] to the DenseGTDataset call solved the issue for me.
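
For anyone hitting the same error, here is a minimal sketch of the failure and the workaround. It does not import deeprvat: DenseGTDatasetStub is a hypothetical stand-in that reproduces only the argument check seen in the traceback, and the config layout (the "variants.parquet" path and the "example_option" key) is an assumption made for illustration.

```python
# Hypothetical stand-in for DenseGTDataset; only the variant_file check
# from the traceback is reproduced here.
class DenseGTDatasetStub:
    def __init__(self, gt_file=None, variant_file=None,
                 skip_y_na=False, skip_x_na=False, **kwargs):
        if variant_file is None:
            raise ValueError("variant_file must be specified")
        self.gt_file = gt_file
        self.variant_file = variant_file
        self.extra = kwargs


# Assumed config layout: variant_file sits next to dataset_config,
# not inside it, so ** unpacking of dataset_config never sees it.
data_config = {
    "gt_file": "genotypes.h5",
    "variant_file": "variants.parquet",      # illustrative path
    "dataset_config": {"example_option": 1},  # illustrative option
}


def build_without_variant_file(cfg):
    # Mirrors the call quoted above: variant_file is never forwarded.
    return DenseGTDatasetStub(
        gt_file=cfg["gt_file"],
        skip_y_na=True,
        skip_x_na=True,
        **cfg["dataset_config"],
    )


def build_with_variant_file(cfg):
    # Workaround: forward variant_file explicitly.
    return DenseGTDatasetStub(
        gt_file=cfg["gt_file"],
        variant_file=cfg["variant_file"],
        skip_y_na=True,
        skip_x_na=True,
        **cfg["dataset_config"],
    )
```

Calling build_without_variant_file(data_config) raises the same ValueError as in the traceback, while build_with_variant_file(data_config) constructs the object. If variant_file were also present inside dataset_config, the second call would raise a TypeError for a duplicate keyword argument, so the key should live in only one of the two places.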

HolEv commented 4 months ago

Thank you for bringing this up. This was fixed in #104.