arq5x / gemini

a lightweight db framework for exploring genetic variation.
http://gemini.readthedocs.org
MIT License

bash master.sh window.t0X errors #345

Closed nicholasblackburn closed 9 years ago

nicholasblackburn commented 9 years ago

Hi there,

I've seen this cross-posted in a few places without a response, and I have also posted it to the Google Group.

Haven't been able to get gemini to work yet at all.

Ran gemini update, works fine. Ran bash master-test.sh (results below)

Fails at query.t22...\c and then again for all window.t0X...\c tests.

Regards,

Nick

Nicholass-MacBook-Pro:gemini-master Nicholas$ gemini update
/usr/local/share/gemini/anaconda/bin/conda
Fetching package metadata: ..
Solving package specifications: .
Package plan for installation in environment /usr/local/share/gemini/anaconda:

The following packages will be UPDATED:

numpy: 1.8.2-py27_0 --> 1.9.0-py27_0

Unlinking packages ...
[ COMPLETE ] |##############################################################################################################################################################################| 100%
Linking packages ...
[ COMPLETE ] |##############################################################################################################################################################################| 100%
Fetching package metadata: ...
Solving package specifications: ...............
Package plan for installation in environment /usr/local/share/gemini/anaconda:

The following packages will be DOWNGRADED:

numpy: 1.9.0-py27_0 --> 1.8.2-py27_0

Unlinking packages ...
[ COMPLETE ] |##############################################################################################################################################################################| 100%
Linking packages ...
[ COMPLETE ] |##############################################################################################################################################################################| 100%
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.7.1 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 1))
Requirement already satisfied (use --upgrade to upgrade): pyparsing>=1.5.6,<=1.5.7 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 2))
Requirement already satisfied (use --upgrade to upgrade): pysam>=0.6 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 3))
Requirement already satisfied (use --upgrade to upgrade): cyvcf>=0.1.10 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 4))
Requirement already satisfied (use --upgrade to upgrade): PyYAML>=3.10 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 5))
Requirement already satisfied (use --upgrade to upgrade): pybedtools>=0.6.2 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 6))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7.1 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 7))
Requirement already satisfied (use --upgrade to upgrade): python-graph-core>=1.8.2 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 8))
Requirement already satisfied (use --upgrade to upgrade): python-graph-dot>=1.8.2 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 9))
Requirement already satisfied (use --upgrade to upgrade): bottle>=0.11.6 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 10))
Requirement already satisfied (use --upgrade to upgrade): ipython-cluster-helper>=0.2.23 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 11))
Requirement already satisfied (use --upgrade to upgrade): bx-python>=0.7.1 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 12))
Requirement already satisfied (use --upgrade to upgrade): pandas>=0.11.0 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 13))
Requirement already satisfied (use --upgrade to upgrade): openpyxl>=1.6.1,<2.0.0 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): scipy>=0.12.0 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): Unidecode>=0.04.14 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 16))
Downloading/unpacking gemini==0.10.1 (from -r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 17))
Downloading gemini-0.10.1.tar.gz (15.9MB): 15.9MB downloaded
Running setup.py (path:/private/var/folders/7g/nqfggqvj7p13bfny9n2krt5c0000gn/T/pip_build_Nicholas/gemini/setup.py) egg_info for package gemini

Requirement already satisfied (use --upgrade to upgrade): pydot in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from python-graph-dot>=1.8.2->-r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 9))
Requirement already satisfied (use --upgrade to upgrade): pyzmq>=2.1.11 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from ipython-cluster-helper>=0.2.23->-r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 11))
Requirement already satisfied (use --upgrade to upgrade): ipython>=1.1.0 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from ipython-cluster-helper>=0.2.23->-r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 11))
Requirement already satisfied (use --upgrade to upgrade): netifaces>=0.10.3 in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages (from ipython-cluster-helper>=0.2.23->-r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 11))
Requirement already satisfied (use --upgrade to upgrade): setuptools in /usr/local/share/gemini/anaconda/lib/python2.7/site-packages/setuptools-7.0-py2.7.egg (from pydot->python-graph-dot>=1.8.2->-r https://raw.githubusercontent.com/arq5x/gemini/master/requirements.txt (line 9))
Installing collected packages: gemini
Found existing installation: gemini 0.11.0a
Uninstalling gemini:
Successfully uninstalled gemini
Running setup.py install for gemini

changing mode of build/scripts-2.7/gemini from 644 to 755
changing mode of /usr/local/share/gemini/anaconda/bin/gemini to 755

Successfully installed gemini
Cleaning up...
Gemini upgraded to latest version
Checking required dependencies...
curl found

Gemini data files updated
remote: Counting objects: 6, done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 6 (delta 1), reused 0 (delta 0)
Unpacking objects: 100% (6/6), done.
From https://github.com/arq5x/gemini

arq5x commented 9 years ago

Sorry for the trouble. This is odd, as I am not able to reproduce it. It looks as though bedtools may not be installed on your system, although the gemini installer should install it automatically.

Can you check that you have bedtools installed?
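One way to run that check from Python is sketched below. It assumes a modern Python 3 (3.7 or later) rather than the Python 2.7 environment shown in the log above, and `tool_version` is a hypothetical helper name, not part of gemini. It only looks for the executable on `PATH` and asks it for its version via `bedtools --version`.

```python
import shutil
import subprocess


def tool_version(name):
    """Return the first line printed by `<name> --version`.

    Returns None when no executable called `name` is on PATH, and an
    empty string when the tool runs but prints nothing.
    """
    path = shutil.which(name)
    if path is None:
        return None
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some tools print their version to stderr instead of stdout.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    found = tool_version("bedtools")
    if found is None:
        print("bedtools not found on PATH")
    else:
        print("bedtools found:", found)
```

Note that this checks the shell's `PATH`, which may differ from the copy of bedtools inside the gemini anaconda directory.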

nicholasblackburn commented 9 years ago

Thanks for responding, sure do:

Nicholass-MacBook-Pro:gemini Nicholas$ bedtools --version
bedtools v2.17.0

Nicholass-MacBook-Pro:gemini Nicholas$ bedtools
bedtools: flexible tools for genome arithmetic and DNA sequence analysis.
usage: bedtools [options]

The bedtools sub-commands include:

[ Genome arithmetic ]
    intersect     Find overlapping intervals in various ways.
    window        Find overlapping intervals within a window around an interval.
    closest       Find the closest, potentially non-overlapping interval.
    coverage      Compute the coverage over defined intervals.
    map           Apply a function to a column for each overlapping interval.
    genomecov     Compute the coverage over an entire genome.
    merge         Combine overlapping/nearby intervals into a single interval.
    cluster       Cluster (but don't merge) overlapping/nearby intervals.
    complement    Extract intervals not represented by an interval file.
    subtract      Remove intervals based on overlaps b/w two files.
    slop          Adjust the size of intervals.
    flank         Create new intervals from the flanks of existing intervals.
    sort          Order the intervals in a file.
    random        Generate random intervals in a genome.
    shuffle       Randomly redistrubute intervals in a genome.
    annotate      Annotate coverage of features from multiple files.

[ Multi-way file comparisons ]
    multiinter    Identifies common intervals among multiple interval files.
    unionbedg     Combines coverage intervals from multiple BEDGRAPH files.

[ Paired-end manipulation ]
    pairtobed     Find pairs that overlap intervals in various ways.
    pairtopair    Find pairs that overlap other pairs in various ways.

[ Format conversion ]
    bamtobed      Convert BAM alignments to BED (& other) formats.
    bedtobam      Convert intervals to BAM records.
    bamtofastq    Convert BAM records to FASTQ records.
    bedpetobam    Convert BEDPE intervals to BAM records.
    bed12tobed6   Breaks BED12 intervals into discrete BED6 intervals.

[ Fasta manipulation ]
    getfasta      Use intervals to extract sequences from a FASTA file.
    maskfasta     Use intervals to mask sequences from a FASTA file.
    nuc           Profile the nucleotide content of intervals in a FASTA file.

[ BAM focused tools ]
    multicov      Counts coverage from multiple BAMs at specific intervals.
    tag           Tag BAM alignments based on overlaps with interval files.

[ Statistics tools ]
    jaccard       Calculates the Jaccard statistic b/w two sets of intervals.

[ Miscellaneous tools ]
    overlap       Computes the amount of overlap from two intervals.
    igv           Create an IGV snapshot batch script.
    links         Create a HTML page of links to UCSC locations.
    makewindows   Make interval "windows" across a genome.
    groupby       Group by common cols. & summarize oth. cols. (~ SQL "groupBy")
    expand        Replicate lines based on lists of values in columns.

[ General help ]
    --help        Print this help menu.
    --version     What version of bedtools are you using?.
    --contact     Feature requests, bugs, mailing lists, etc.

nicholasblackburn commented 9 years ago

I've tried on two VMs on the cluster that I have access to, using the install without root access approach. Same result.

arq5x commented 9 years ago

Okay, I can now replicate this after upgrading pybedtools on my system to version 0.6.8, which is the version that recent installations of gemini install automatically. This looks like it might be a new bug in pybedtools. We will have to explore the cause. In the interim, I think you can safely ignore the errors and use GEMINI so long as you avoid the window tool.
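To see whether a given environment has the pybedtools version mentioned above, something like the following sketch could be used. It assumes Python 3.8 or later (for `importlib.metadata`), which is newer than the Python 2.7 environment in the original log. The helper names are hypothetical, and treating 0.6.8 as the only suspect version is an assumption: it is simply the one version this thread names as reproducing the failures.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: 0.6.8 is the only version this thread reports as reproducing
# the window test failures; other versions may or may not be affected.
SUSPECT_PYBEDTOOLS = "0.6.8"


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def window_tests_suspect(pybedtools_version):
    """True when the version matches the one reported in this thread."""
    return pybedtools_version == SUSPECT_PYBEDTOOLS


if __name__ == "__main__":
    found = installed_version("pybedtools")
    if found is None:
        print("pybedtools is not installed in this environment")
    elif window_tests_suspect(found):
        print("pybedtools", found, "matches the version reported in this thread")
    else:
        print("pybedtools", found, "is not the version reported in this thread")
```

Run it with the same Python interpreter that gemini uses, otherwise it reports on a different environment.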

cc @udp3f

nicholasblackburn commented 9 years ago

Awesome, thank you. So this definitely shouldn't affect loading the VCF file?

(Also, I have a pre-filtered VCF of 96K variants and I just want to add the ENCODE annotations. Do I need to run VEP?)

arq5x commented 9 years ago

This will not affect the load at all. If you just want ENCODE annotations, you can skip VEP. Just be aware that skipping VEP also means you won't have gene annotations.

nicholasblackburn commented 9 years ago

Thank you, Aaron! I'll let you know if I hit problems. This is the end stage of the analysis, after ANNOVAR filtering; when I don't have a deadline looming, I'll redo everything with GEMINI. Is this the best place to contact you?

arq5x commented 9 years ago

Sure, here is fine or the GEMINI mailing list.

arq5x commented 9 years ago

I just pushed a fix for this. All of the window tests should pass if you do:

$ gemini update --devel

nicholasblackburn commented 9 years ago

Works perfectly now - thanks!