dwightman closed this issue 4 years ago
Hi Doug,
The annotation files don't require an rsID column. The computation failed because the script treated that column as an annotation column, even though it is not numeric. I just updated the script to make sure that all the annotation columns are numeric.
Please let me know if this resolves the issue!
Omer
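The failure mode described in this reply can be checked before running the script. The following is a minimal, standard-library-only sketch and not the check that was added to compute_ldscores.py. It assumes the annotation file is whitespace-delimited with a header row, and it assumes that SNP, CHR, BP, A1 and A2 are the identifier columns, so every other column is treated as an annotation. The helper names `is_number` and `non_numeric_annotations` are hypothetical.

```python
import io

# Assumption: these are the identifier columns; every other column
# is treated as an annotation and is expected to be numeric.
ID_COLUMNS = {"SNP", "CHR", "BP", "A1", "A2"}


def is_number(text):
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def non_numeric_annotations(handle):
    """Return the non-identifier columns that hold a non-numeric value."""
    header = handle.readline().split()
    bad = set()
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for name, value in zip(header, fields):
            if name not in ID_COLUMNS and not is_number(value):
                bad.add(name)
    # Report the offending columns in header order.
    return [name for name in header if name in bad]


# The header and example row quoted in this thread.
sample = io.StringIO(
    "SNP CHR BP A1 A2 rsID Z N ANNO1 ANNO2\n"
    "1:100000012:G:T 1 100000012 T G rs10875231 -1.2578195807 652942.54 1 0\n"
)
print(non_numeric_annotations(sample))  # ['rsID']
```

Under the same assumption, Z and N would also be treated as annotations. They pass this check only because their values are numeric.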
On Mon, Mar 2, 2020 at 7:42 AM dwightman notifications@github.com wrote:
Hello,
I am trying to run compute_ldscores.py to calculate LD scores for my annotations. I get an error at the step "[INFO] Applying initial ldscores loop". This is the error:
Traceback (most recent call last):
  File "compute_ldscores.py", line 190, in <module>
    df_ldscores = compute_ldscores(args)
  File "compute_ldscores.py", line 149, in compute_ldscores
    ldscores = geno_array.ldScoreVarBlocks(block_left, args.chunk_size, annot=annot_values)
  File "/home/doug/Documents/Polyfun/polyfun/ldsc_polyfun/ldscore.py", line 127, in ldScoreVarBlocks
    return self.corSumVarBlocks(block_left, c, func, snp_getter, annot)
  File "/home/doug/Documents/Polyfun/polyfun/ldsc_polyfun/ldscore.py", line 200, in corSumVarBlocks
    cor_sum[l_A:l_A+b, :] += np.dot(rfuncAB, annot[l_B:l_B+c, :])
  File "<__array_function__ internals>", line 6, in dot
TypeError: can't multiply sequence by non-int of type 'float'
I am not sure what is causing this error. Do you have any ideas?
This is the command I am using:
python compute_ldscores.py --bfile ../AD31chr1polyfun --annot ../gFIUDWchr1anno.gz --out ../chr1.parquet --ld-wind-kb 2000
Here AD31chr1polyfun is the stem for the PLINK-formatted bed, bim and fam files. The SNP name format in the PLINK files is chr:pos:A1:A2, where A1 and A2 are organised alphabetically.
The format of gFIUDWchr1anno.gz looks like this:
SNP              CHR  BP         A1  A2  rsID        Z              N          ANNO1  ANNO2
1:100000012:G:T  1    100000012  T   G   rs10875231  -1.2578195807  652942.54  1      0
Both files contain exactly the same SNPs. Any ideas what I can do to resolve the problem?
Cheers, Doug
Great, thanks. I removed the rsID column and it works well.
Doug
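For anyone hitting the same error, removing the column can be done in many ways. The following is one minimal standard-library sketch, assuming a gzip-compressed, whitespace-delimited annotation file with a header row. The function name `drop_column` and the file names are hypothetical, and the output is written tab-delimited as an arbitrary choice.

```python
import gzip
import os
import tempfile


def drop_column(src_path, dst_path, column):
    """Copy a whitespace-delimited .gz table, omitting one named column."""
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        header = src.readline().split()
        idx = header.index(column)  # raises ValueError if the column is absent
        dst.write("\t".join(header[:idx] + header[idx + 1:]) + "\n")
        for line in src:
            fields = line.split()
            if fields:  # skip blank lines
                dst.write("\t".join(fields[:idx] + fields[idx + 1:]) + "\n")


# Demo on a throwaway file built from the example row in this thread.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "anno.gz")          # hypothetical name
dst_path = os.path.join(tmp_dir, "anno.norsid.gz")   # hypothetical name
with gzip.open(src_path, "wt") as f:
    f.write("SNP CHR BP A1 A2 rsID Z N ANNO1 ANNO2\n")
    f.write("1:100000012:G:T 1 100000012 T G rs10875231 -1.2578195807 652942.54 1 0\n")

drop_column(src_path, dst_path, "rsID")
with gzip.open(dst_path, "rt") as f:
    result = f.read()
print(result)
```

The original file is left untouched, so the output can be compared against it before being passed to --annot.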