AndersenLab / VCF-kit

VCF-kit: Assorted utilities for the variant call format
http://www.andersenlab.org
MIT License

something wrong installing with anaconda #26

Closed Haoyuan17 closed 4 years ago

Haoyuan17 commented 4 years ago

Hello,

I used these commands to install with Anaconda:

conda config --add channels bioconda
conda create -n vcf-kit python=2.7 vcfkit

And the error is:

UnsatisfiableError: The following specifications were found to be incompatible with each other

Output in format: Requested package -> Available versions

Package python conflicts for:
vcfkit -> biopython -> python[version='3.4.*|3.5.*|3.6.*|>=3.8,<3.9.0a0|>=3.7,<3.8.0a0|>=3.6,<3.7.0a0|>=3.5,<3.6.0a0']
vcfkit -> python[version='2.7.*|>=2.7,<2.8.0a0']
python=2.7

Haoyuan17 commented 4 years ago

I found a biopython=1.69 version, installed it, and then tried to install vcfkit. It still does not work. The error:

UnsatisfiableError: The following specifications were found to be incompatible with the existing python installation in your environment: Specifications:

Haoyuan17 commented 4 years ago

I have solved the problem and it seems to be working well. The solution:

conda create -n vcf-kit python=2.7
conda install matplotlib=2.2.5
conda install yahmm=1.1.2
pip install numpy==1.16.0
pip install VCF-kit

After this, open "anaconda3/envs/vcf-kit/lib/python2.7/site-packages/vcfkit/utils/vcf.py" and change line 11 from "np.set_printoptions(threshold=np.nan)" to "np.set_printoptions(threshold=10000000000000)".
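The manual edit above could also be scripted. The following is a minimal sketch, assuming the installed vcf.py still contains the exact text "np.set_printoptions(threshold=np.nan)". The helper names patch_threshold and patch_file are hypothetical, and the replacement value is simply the one used in this thread. The underlying cause is probably that newer NumPy releases reject a NaN print threshold, but that is an inference from the workaround, not something confirmed here.

```python
import io

# The line reported in this thread as failing (line 11 of vcfkit/utils/vcf.py).
OLD = "np.set_printoptions(threshold=np.nan)"
# Replacement value taken from the thread; any large finite number should behave the same.
NEW = "np.set_printoptions(threshold=10000000000000)"


def patch_threshold(source):
    """Return the source text with the NaN print threshold replaced by a finite one."""
    return source.replace(OLD, NEW)


def patch_file(path):
    """Patch the file at `path` in place; return True if the file was changed."""
    with io.open(path, "r", encoding="utf-8") as handle:
        original = handle.read()
    patched = patch_threshold(original)
    if patched == original:
        return False  # nothing to do: text not found or already patched
    with io.open(path, "w", encoding="utf-8") as handle:
        handle.write(patched)
    return True
```

To use it, call patch_file with the vcf.py path quoted above, adjusted to wherever your Anaconda environment lives. Running it a second time is harmless, since the original text is no longer present.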

And it works. Because I could not install vcfkit with conda (I don't know why), I created a Python 2.7 environment with conda and then installed the packages that vcfkit needs, as the author describes. The numpy version, or some other package version, may differ from the one the author uses, so I had to change the code slightly. I believe it is a numpy version issue.

danielecook commented 4 years ago

@Haoyuan17 I recommend installing vcf-kit in its own environment.

I think you can simply do:

conda create -n vcf-kit vcf-kit

Haoyuan17 commented 4 years ago

@danielecook At first I used conda to install, with these commands:

conda config --add channels bioconda
conda create -n vcf-kit python=2.7 vcfkit

But there was an error, as follows:

(Screenshot, 2020-06-15 10:09:02 AM)

danielecook commented 4 years ago

@Haoyuan17 what tools are you looking to use within vcf-kit? I am working on a new version that is compiled here: https://www.github.com/danielecook/seq-collection

Do you have Docker installed? It looks like Python 2 is being deprecated, and getting vcf-kit to work with Python 3 will take some time.

One nice alternative you can try is running your analysis in a docker container.

docker pull ezequieljsosa/vcfkit

Haoyuan17 commented 4 years ago

@danielecook Thanks a lot. My Python version is 2.7 and I don't know why I could not install vcf-kit easily. My friend could install it in a few minutes, so it must be a problem with my PC.

Actually, I can use vcf-kit now, after a complicated process. I need a Newick file, and vcf-kit helps me convert VCF to Newick. The "vk phylo" command is good enough except for large VCF files (I have a 50GB VCF file).

Do you have any advice on using "vk phylo" with large files?

I noticed that "seq-collection" can convert VCF to JSON. I am not familiar with JSON. Could I use JSON to get a Newick file? How does JSON work with large files?

Haoyuan17 commented 4 years ago

@danielecook I am writing to say that vcf-kit is really good. The large VCF file issue I mentioned turned out not to be an issue. At first, I believed that I could not process such a large file on my PC with an ordinary amount of RAM. I was wrong. The "vk phylo" command works well with the 50GB VCF.

With MEGA, online tools, or other software, it is hard to get a FASTA file from a VCF. I have to align the FASTA and then build a phylogenetic tree. This takes far longer than vcf-kit, and my PC usually breaks down at the alignment step.

Thanks for your work. I would like to introduce vcf-kit to my lab colleagues.

danielecook commented 4 years ago

Thanks!

danielecook commented 4 years ago

@Haoyuan17

I noticed that "seq-collection" can convert VCF to JSON. I am not familiar with JSON. Could I use JSON to get a Newick file? How does JSON work with large files?

The JSON format is not really useful from a genetics perspective; it is designed for sending data, and it might be useful if you wanted to send VCF data to a web browser.
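To illustrate JSON as a transport format, here is a minimal sketch of what a single VCF data line might look like once serialized. The VCF line is made up for illustration, and the function name and JSON field layout are hypothetical; this is not the output format of seq-collection.

```python
import json

# A made-up, single-sample VCF data line (tab-separated), for illustration only.
VCF_LINE = "I\t1000\t.\tA\tG\t50\tPASS\tDP=20\tGT\t0/1"


def vcf_line_to_json(line, sample_names):
    """Serialize one VCF data line to a JSON string (hypothetical field layout)."""
    fields = line.rstrip("\n").split("\t")
    # The first nine VCF columns are fixed; per-sample columns follow.
    chrom, pos, _id, ref, alt, _qual, _filter, _info, _format = fields[:9]
    record = {
        "CHROM": chrom,
        "POS": int(pos),
        "REF": ref,
        "ALT": alt.split(","),
        # Raw per-sample strings, keyed by sample name.
        "genotypes": dict(zip(sample_names, fields[9:])),
    }
    return json.dumps(record, sort_keys=True)


if __name__ == "__main__":
    print(vcf_line_to_json(VCF_LINE, ["sample1"]))
```

A structure like this is easy for a browser to consume, but it carries no phylogenetic information beyond what the VCF already holds, so it does not help with producing a Newick tree.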