Closed pengyan19 closed 3 years ago
what version of ipyrad are you using?
I used ipyrad 0.9.63. I guess the pandas version could affect this result; I used pandas 1.0.3. I also tried installing the software with Python 2.7, but that failed too, because that install did not have the module to convert VCF to HDF5.
Python 2.7 is no longer supported, so please use python3 and try again. Also please use the most recent version of ipyrad, as it's possible we've already fixed this problem. Let me know how it goes.
I have tried Python 3.7, but I ran into the same problem. Could you recommend which pandas or numpy version I should install?
Python 3.7 should work. If you install ipyrad in a clean conda environment then it will pull down all required libraries.
conda create -n ipyrad_env python=3.8
conda activate ipyrad_env
conda install -c conda-forge -c bioconda ipyrad
This will install the most recent version of ipyrad and all required libraries.
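To confirm which versions actually ended up in the new environment, a small check like the following may help. It is only a sketch using the standard-library `importlib.metadata` (Python 3.8+); the helper name `installed_versions` is made up for illustration and is not part of ipyrad.

```python
from importlib import metadata


def installed_versions(packages=("ipyrad", "numpy", "pandas")):
    """Return {package name: version string, or None if not installed}.

    Hypothetical helper for illustration; not part of ipyrad.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the active environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Run it with the `ipyrad_env` environment activated, so that it reports the versions from that environment and not from the base install.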
I also built a new env and used Python 3.8 to install the software, but I guess the newer pandas version no longer supports the way it is used here.
This is what I have in a working environment:
numpy 1.19.4 py38hf0fd68c_1 conda-forge
pandas 1.1.4 py38h0ef3d22_0 conda-forge
I still have the same problem. I built a new env with Python 3.8 and installed ipyrad 0.9.78 in it. Can you email me a VCF file that you have tested? My email is 1300538321@qq.com. I will test with your VCF.
Indexing VCF to HDF5 database file
VCF: 20026 SNPs; 251 scaffolds
[ ] 0% 0:00:00 | converting VCF to HDF5
Traceback (most recent call last):
File "03.vcf2hdf5.py", line 21, in
I emailed a vcf to test.
OK, I found the cause of my problem: I have too many chromosomes. If I change the chromosome names to a numeric format, it works well.
I have another question about treemix. I want to use all SNPs, but ipyrad filtered them as shown below. Also, should I choose the best number of migration edges according to the likelihood? We do not know which tree explains more of the variance. Do you know how to calculate this?
Samples: 221
Sites before filtering: 19366
Filtered (indels): 0
Filtered (bi-allel): 0
Filtered (mincov): 0
Filtered (minmap): 19246
Filtered (subsample invariant): 2611
Filtered (minor allele frequency): 0
Filtered (combined): 19264
Sites after filtering: 102
Sites containing missing values: 52 (50.98%)
Missing values in SNP matrix: 615 (2.73%)
SNPs (total): 102
SNPs (unlinked): 98
subsampled 98 unlinked SNPs
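For anyone hitting the same conversion error, the chromosome renaming described above could be sketched roughly as follows. This is an assumption about what "change the format to number" means: the thread does not show the exact steps used. The function name `rename_chroms_to_numbers` is hypothetical, and IDs are simply assigned 1, 2, 3, ... in order of first appearance.

```python
import re

# Matches header lines such as "##contig=<ID=scaffold_12,length=5000>".
CONTIG_RE = re.compile(r"^(##contig=<ID=)([^,>]+)(.*)$", re.DOTALL)


def rename_chroms_to_numbers(lines):
    """Yield VCF lines with every CHROM name replaced by an integer ID.

    Hypothetical helper for illustration; not part of ipyrad.
    IDs are assigned in order of first appearance, and ##contig header
    lines are rewritten with the same mapping so they stay consistent.
    """
    ids = {}

    def number_for(name):
        # Assign the next integer the first time a name is seen.
        if name not in ids:
            ids[name] = str(len(ids) + 1)
        return ids[name]

    for line in lines:
        contig = CONTIG_RE.match(line)
        if contig:
            prefix, name, rest = contig.groups()
            yield prefix + number_for(name) + rest
        elif line.startswith("#") or not line.strip():
            # Other header lines and blank lines pass through unchanged.
            yield line
        else:
            # Data line: CHROM is the first tab-separated column.
            chrom, sep, rest = line.partition("\t")
            yield number_for(chrom) + sep + rest
```

To apply it to a file, pass an open file handle for an uncompressed VCF as `lines` and write the yielded lines to a new file, keeping the original untouched so the name mapping can be recovered later.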
Ah good, glad you figured out the vcf conversion problem. I will close this issue as now the original problem has been resolved.
As for the treemix question, this is less an ipyrad issue than it is a question about usage, which is more appropriate for the gitter channel: https://gitter.im/dereneaton/ipyrad
I am not sure I understand your question exactly. Can you please try restating your question and posting it to the gitter channel? Thanks!
This is my code:
converter = ipa.vcf_to_hdf5(name="yanfen_LD20K", data=vcf, ld_block_size=20000)
But I ran into the following error. It may be caused by pandas. Can you help me deal with it?
ValueError: Length of passed values is 2262938, index implies 20026.