Closed ccymak closed 4 years ago
yes, your vcf should be bgzipped and indexed.
Dear Brent,
Despite putting in a vcf.gz file, these errors still occur. Is the .ix indexer the problem? Could it have anything to do with the python version?
Many Thanks
Christopher C Mak, Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong
python3/3.6.4 is loaded
2019-10-25 09:54:44 paedbc01 peddy.cli[407545] INFO Running Peddy version 0.4.3
/home/ccymak/peddy/peddy/cli.py:198: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
  ped_df = ped_df.ix[samples, :]
/home/ccymak/.local/lib/python3.6/site-packages/pandas/core/indexing.py:822: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
  retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
2019-10-25 09:54:44 paedbc01 peddy.cli[407545] INFO ped_check
2019-10-25 09:54:50 paedbc01 peddy.peddy[407545] INFO plotting
2019-10-25 09:54:51 paedbc01 peddy.cli[407545] INFO ran in 7.7 seconds
2019-10-25 09:54:52 paedbc01 peddy.cli[407545] INFO het_check
2019-10-25 09:54:58 paedbc01 peddy.pca[407545] INFO loaded and subsetted thousand-genomes genotypes (shape: (2504, 11496)) in 0.6 seconds
/home/ccymak/.local/lib/python3.6/site-packages/sklearn/svm/base.py:193: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.
  "avoid this warning.", FutureWarning)
2019-10-25 09:54:59 paedbc01 peddy.pca[407545] INFO ran randomized PCA on thousand-genomes samples at 11496 sites in 0.7 seconds
2019-10-25 09:54:59 paedbc01 peddy.pca[407545] INFO Projected thousand-genomes genotypes and sample genotypes and predicted ancestry via SVM in 0.2 seconds
2019-10-25 09:55:00 paedbc01 peddy.cli[407545] INFO ran in 8.2 seconds
/home/ccymak/peddy/peddy/cli.py:224: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
  ped_df[col_name] = list(df[col].ix[samples])
2019-10-25 09:55:00 paedbc01 peddy.cli[407545] INFO sex_check
no intervals found for b'/home/ccymak/tof_exome/TOF_Solexa_99.genotypecalls.vcf.gz' at X:2781480
2019-10-25 09:55:01 paedbc01 peddy.peddy[407545] INFO sex-check: 0 skipped / 10000 kept
2019-10-25 09:55:01 paedbc01 peddy.cli[407545] INFO ran in 1.3 seconds
Hi, does your VCF file also have an index file (.csi or .tbi)?
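Both requirements mentioned in this thread (bgzip compression and a .tbi or .csi index next to the file) can be checked before running peddy. The following is a minimal sketch using only the Python standard library; the function names `looks_bgzipped` and `find_index` are hypothetical helpers, not part of peddy or cyvcf2. It only inspects the first BGZF block header and looks for an index file by name; it does not validate the index itself.

```python
import os
import struct


def looks_bgzipped(path):
    """Return True if the file starts with a BGZF (bgzip) block header."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        # gzip magic (1f 8b), deflate method (08), FEXTRA flag set (04)
        if len(header) < 12 or header[:4] != b"\x1f\x8b\x08\x04":
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = handle.read(xlen)
    # Walk the gzip extra subfields looking for the 2-byte 'BC' subfield
    # that BGZF uses to store the block size.
    pos = 0
    while pos + 4 <= len(extra):
        subfield_id = extra[pos:pos + 2]
        (subfield_len,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if subfield_id == b"BC" and subfield_len == 2:
            return True
        pos += 4 + subfield_len
    return False


def find_index(path):
    """Return the path of a .tbi or .csi file next to `path`, or None."""
    for suffix in (".tbi", ".csi"):
        if os.path.exists(path + suffix):
            return path + suffix
    return None
```

A plain-text .vcf, or one compressed with ordinary gzip, makes `looks_bgzipped` return False; in either case the file would need to be recompressed with bgzip and indexed again.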
Dear Brent,
Thanks so much for your reply. I have redone the indexing and it works fine now. I suppose I can just ignore the remaining warnings?
For the relatedness check, am I right in saying that related samples will have a high IBS2 and a low IBS0, which puts them in the top left, above the other samples?
Thanks again for creating such a great tool!
Regards,
Christopher C Mak, Department of Paediatrics and Adolescent Medicine, LKS Faculty of Medicine, The University of Hong Kong
2019-10-30 01:39:57 hpch01 peddy.cli[168547] INFO Running Peddy version 0.4.3
/home/ccymak/peddy/peddy/cli.py:198: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
  ped_df = ped_df.ix[samples, :]
/home/ccymak/.local/lib/python3.6/site-packages/pandas/core/indexing.py:822: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
  retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
2019-10-30 01:39:57 hpch01 peddy.cli[168547] INFO ped_check
2019-10-30 01:40:10 hpch01 peddy.peddy[168547] INFO plotting
2019-10-30 01:40:12 hpch01 peddy.cli[168547] INFO ran in 14.6 seconds
2019-10-30 01:40:12 hpch01 peddy.cli[168547] INFO het_check
2019-10-30 01:40:21 hpch01 peddy.pca[168547] INFO loaded and subsetted thousand-genomes genotypes (shape: (2504, 11496)) in 0.8 seconds
/home/ccymak/.local/lib/python3.6/site-packages/sklearn/svm/base.py:193: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.
  "avoid this warning.", FutureWarning)
2019-10-30 01:40:22 hpch01 peddy.pca[168547] INFO ran randomized PCA on thousand-genomes samples at 11496 sites in 1.1 seconds
2019-10-30 01:40:22 hpch01 peddy.pca[168547] INFO Projected thousand-genomes genotypes and sample genotypes and predicted ancestry via SVM in 0.2 seconds
2019-10-30 01:40:23 hpch01 peddy.cli[168547] INFO ran in 11.3 seconds
/home/ccymak/peddy/peddy/cli.py:224: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
  ped_df[col_name] = list(df[col].ix[samples])
2019-10-30 01:40:23 hpch01 peddy.cli[168547] INFO sex_check
no intervals found for b'/home/ccymak/tof_exome/TOF_Solexa_99.genotypecalls.vcf.gz' at X:2781480
2019-10-30 01:40:25 hpch01 peddy.peddy[168547] INFO sex-check: 0 skipped / 10000 kept
2019-10-30 01:40:25 hpch01 peddy.cli[168547] INFO ran in 2.1 seconds
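The `FutureWarning` lines in the log above are deprecation notices from pandas and scikit-learn, and the run completes despite them. If they are only noise, Python's standard `warnings` module can filter them. The sketch below assumes the noisy code is called from your own Python script; `noisy_call`, `run_quietly` and `count_future_warnings` are hypothetical names used for illustration, and nothing here is peddy's own code.

```python
import warnings


def noisy_call():
    # Hypothetical stand-in for library code that uses a deprecated API.
    warnings.warn(".ix is deprecated", FutureWarning)
    return "result"


def run_quietly(func):
    """Call func with FutureWarning suppressed, restoring filters afterwards."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=FutureWarning)
        return func()


def count_future_warnings(func):
    """Return how many FutureWarnings func emits (used here to show the effect)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return sum(issubclass(w.category, FutureWarning) for w in caught)
```

For a command-line program, setting the environment variable `PYTHONWARNINGS=ignore::FutureWarning` before the command applies the same filter without touching any code.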
Yes, that's correct. You can also change one axis to show relatedness, which might be easier to interpret.
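To make the IBS0/IBS2 terms concrete, here is a minimal sketch of how the counts are commonly defined for a pair of samples. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with `None` for a missing call. The function name `ibs_counts` is hypothetical, and this illustrates the general definition rather than the implementation used by peddy or cyvcf2.

```python
def ibs_counts(genotypes_a, genotypes_b):
    """Count IBS0, IBS1 and IBS2 sites between two samples.

    Genotypes are alternate-allele counts: 0 (hom-ref), 1 (het), 2 (hom-alt).
    None marks a missing call; such sites are skipped.
    """
    counts = {"ibs0": 0, "ibs1": 0, "ibs2": 0}
    for a, b in zip(genotypes_a, genotypes_b):
        if a is None or b is None:
            continue
        # Same genotype shares 2 alleles; hom-ref vs hom-alt shares none.
        shared = 2 - abs(a - b)
        counts["ibs%d" % shared] += 1
    return counts
```

A parent and child always share at least one allele at every site, barring genotyping errors, so such a pair should show an IBS0 count near zero, which is why related pairs separate from unrelated ones on the plot.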
May I ask if this is a problem with the vcf file?
peddy --plot -p $cpu --loglevel DEBUG --prefix tof99test /home/ccymak/tof_exome/TOF_Solexa_99.genotypecalls.vcf /home/ccymak/tof_exome/2018_SS-180814-01a/TOF_99.ped
Thanks
python3/3.6.4 is loaded
2019-10-24 15:47:00 paedbc01 peddy.cli[223643] INFO Running Peddy version 0.4.3
/home/ccymak/peddy/peddy/cli.py:198: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
  ped_df = ped_df.ix[samples, :]
/home/ccymak/.local/lib/python3.6/site-packages/pandas/core/indexing.py:822: FutureWarning: .ix is deprecated. Please use .loc for label based indexing or .iloc for positional indexing
See the documentation here: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#ix-indexer-is-deprecated
  retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
2019-10-24 15:47:00 paedbc01 peddy.cli[223643] INFO ped_check
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/software/python/3.6.4/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "cyvcf2/cyvcf2.pyx", line 93, in cyvcf2.cyvcf2._par_relatedness
  File "cyvcf2/cyvcf2.pyx", line 783, in cyvcf2.cyvcf2.VCF._site_relatedness
  File "cyvcf2/cyvcf2.pyx", line 658, in gen_variants
  File "cyvcf2/cyvcf2.pyx", line 376, in __call__
AssertionError: error loading tabix index for b'/home/ccymak/tof_exome/TOF_Solexa_99.genotypecalls.vcf'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/ccymak/.local/bin/peddy", line 11, in <module>
    load_entry_point('peddy', 'console_scripts', 'peddy')()
  File "/home/mullinyu/.local/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/mullinyu/.local/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/mullinyu/.local/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/mullinyu/.local/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/ccymak/peddy/peddy/cli.py", line 209, in peddy
    in ("ped_check", "het_check", "sex_check")]):
  File "/home/ccymak/peddy/peddy/cli.py", line 43, in run
    prefix=prefix, **kwargs)
  File "/home/ccymak/peddy/peddy/peddy.py", line 970, in ped_check
    min_depth=min_depth, each=each)
  File "cyvcf2/cyvcf2.pyx", line 39, in cyvcf2.cyvcf2.par_relatedness
  File "/software/python/3.6.4/lib/python3.6/multiprocessing/pool.py", line 735, in next
    raise value
AssertionError: error loading tabix index for b'/home/ccymak/tof_exome/TOF_Solexa_99.genotypecalls.vcf'