arq5x / gemini

a lightweight db framework for exploring genetic variation.
http://gemini.readthedocs.org
MIT License

Compound Hets issue #927

Closed Phillip-a-richmond closed 5 years ago

Phillip-a-richmond commented 5 years ago

I'm getting this error when running compound het models:

Traceback (most recent call last):
  File "/mnt/causes-vnx1/DATABASES/GEMINI-2019/bin/gemini", line 7, in <module>
    gemini_main.main()
  File "/mnt/causes-vnx1/DATABASES/GEMINI-2019/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 1249, in main
    args.func(parser, args)
  File "/mnt/causes-vnx1/DATABASES/GEMINI-2019/anaconda/lib/python2.7/site-packages/gemini/gemini_main.py", line 711, in comp_hets_fn
    CompoundHet(args).run()
  File "/mnt/causes-vnx1/DATABASES/GEMINI-2019/anaconda/lib/python2.7/site-packages/gemini/gim.py", line 307, in run
    for i, s in enumerate(self.report_candidates()):
  File "/mnt/causes-vnx1/DATABASES/GEMINI-2019/anaconda/lib/python2.7/site-packages/gemini/gim.py", line 213, in report_candidates
    for gene, li in self.candidates():
  File "/mnt/causes-vnx1/DATABASES/GEMINI-2019/anaconda/lib/python2.7/site-packages/gemini/gim.py", line 464, in candidates
    gt_types, gt_bases, gt_phases = row['gt_types'], row['gts'], row['gt_phases']
  File "/mnt/causes-vnx1/DATABASES/GEMINI-2019/anaconda/lib/python2.7/site-packages/gemini/GeminiQuery.py", line 449, in __getitem__
    self.cache[key] = self.unpack(self.row[key])
  File "/mnt/causes-vnx1/DATABASES/GEMINI-2019/anaconda/lib/python2.7/site-packages/gemini/compression.py", line 99, in snappy_unpack_blob
    arr.setflags(write=True)
ValueError: cannot set WRITEABLE flag to True of this array

This is my command:

Compound Het Variants

$GEMINI comp_hets \
    --columns "$COLUMNS" \
    --filter "$GNOMAD_GENOME_RARE AND $GNOMAD_EXOME_RARE AND $CONFIDENTREGION AND $SEGDUP AND $INHOUSE_RARE AND ($IMPACT_HIGH OR $IMPACT_MED) AND $FILTER" \
    -d $STRICT_MIN_DP \
    --min-gq $STRICT_MIN_GQ \
    $GEMINIDB > $COMPOUND_HET_OUT
python $TableAnnotator -i $COMPOUND_HET_OUT -o ${COMPOUND_HET_OUT}_annotated.txt

My pipeline is:

From Merged.vcf
SNPeff (GRCh37.75) / VT
bgzip/tabix
BCFTools
VCFAnno
VCF2DB
gemini

With the old gemini it works just fine, but with the update this issue arises. I can provide the database directly if it's not on GitHub (~200M for chr20 of NA12878 trio). Is this because there are no candidate compound het variants? Or is it perhaps because something has changed with the gemini load command specific to gene files that is different than my pipeline?

Thanks, Phil

brentp commented 5 years ago

it looks like you can resolve this with:

pip install numpy==1.15.4

or whatever your conda equivalent of that might be.

see: https://github.com/pandas-dev/pandas/issues/24839
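For anyone who cannot pin numpy, here is a minimal sketch of why the traceback ends in `arr.setflags(write=True)` and of one possible workaround. It is not gemini's actual code: the body of `snappy_unpack_blob` is not shown in this thread, and `unpack_writable` is a hypothetical helper. It assumes the unpacked array is a view over an immutable byte string, which is what the error message suggests.

```python
import numpy as np


def unpack_writable(blob, dtype=np.uint8):
    """Hypothetical helper: decode an immutable byte string into a writable array."""
    # np.frombuffer shares memory with `blob`. Because a byte string is
    # immutable, the resulting view is read-only.
    view = np.frombuffer(blob, dtype=dtype)
    # Copying gives the array its own memory, so it is writable without
    # having to flip the WRITEABLE flag on the shared buffer.
    return view.copy()


# Illustrative data only, standing in for a decompressed genotype blob.
blob = bytes(bytearray([0, 1, 2, 3]))

view = np.frombuffer(blob, dtype=np.uint8)
print(view.flags.writeable)

arr = unpack_writable(blob)
arr[0] = 9
print(arr.flags.writeable)
```

The copy costs one extra allocation per unpacked column, so pinning numpy as suggested above is still the simplest fix for an existing install.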

Phillip-a-richmond commented 5 years ago

Fixed.

Thanks.

snashraf commented 3 years ago

Hi Phillip,

Are you getting this issue only with vcf2db-generated databases, or with GEMINI-generated databases as well?

Regards,
Najeeb