omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License

compute_ldscores.py: line 200 TypeError: can't multiply sequence by non-int of type 'float' #25

Closed dwightman closed 4 years ago

dwightman commented 4 years ago

Hello,

I am trying to run compute_ldscores.py to calculate LD scores for my annotations. I get an error at the step [INFO] Applying initial ldscores loop. This is the error:

Traceback (most recent call last):
  File "compute_ldscores.py", line 190, in <module>
    df_ldscores = compute_ldscores(args)
  File "compute_ldscores.py", line 149, in compute_ldscores
    ldscores = geno_array.ldScoreVarBlocks(block_left, args.chunk_size, annot=annot_values)
  File "/home/doug/Documents/Polyfun/polyfun/ldsc_polyfun/ldscore.py", line 127, in ldScoreVarBlocks
    return self.corSumVarBlocks(block_left, c, func, snp_getter, annot)
  File "/home/doug/Documents/Polyfun/polyfun/ldsc_polyfun/ldscore.py", line 200, in corSumVarBlocks
    cor_sum[l_A:l_A+b, :] += np.dot(rfuncAB, annot[l_B:l_B+c, :])
  File "<__array_function__ internals>", line 6, in dot
TypeError: can't multiply sequence by non-int of type 'float'

I am not sure what is causing this error. Do you have any ideas?

This is the command I am using:

python compute_ldscores.py \
    --bfile ../AD31chr1polyfun \
    --annot ../gFIUDWchr1anno.gz \
    --out ../chr1.parquet \
    --ld-wind-kb 2000

Here AD31chrpolyfun is the stem for the PLINK-formatted bed, bim and fam files. The SNP name format in the PLINK files is chr:pos:A1:A2, where A1 and A2 are ordered alphabetically.

The format of gFIUDWchr1anno.gz looks like this:

SNP CHR BP A1 A2 rsID Z N ANNO1 ANNO2
1:100000012:G:T 1 100000012 T G rs10875231 -1.2578195807 652942.54 1 0

Both files contain exactly the same SNPs. Any ideas what I can do to resolve the problem?

Cheers, Doug

omerwe commented 4 years ago

Hi Doug,

The annotation files don't require an rsID column. The computation failed because the script treated rsID as an annotation column, but it is not numeric. I just updated the script to make sure that all the annotation columns are numeric.

Please let me know if this resolves the issue!

Omer
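
For anyone reading along, here is a minimal sketch of the failure mode described above. It uses toy arrays, not PolyFun's own code: the array names, the numbers and the placeholder IDs are all made up for illustration. The idea is that an annotation matrix holding a string column ends up with object dtype, so np.dot falls back to plain Python multiplication and cannot multiply a float by a string.

```python
import numpy as np

# Toy stand-in for a block of SNP correlations (hypothetical values).
r_block = np.array([[1.0, 0.5],
                    [0.5, 1.0]])

# Toy annotation matrix that accidentally carries a string ID column.
# "id_a" and "id_b" are placeholders, not real rsIDs.
annot_bad = np.array([["id_a", 1, 0],
                      ["id_b", 0, 1]], dtype=object)

try:
    np.dot(r_block, annot_bad)
    error_message = None
except TypeError as exc:
    # A float cannot be multiplied by a string, so the product fails.
    error_message = str(exc)

# Keeping only the numeric columns makes the product well defined.
annot_ok = annot_bad[:, 1:].astype(float)
ld_like = np.dot(r_block, annot_ok)
```

Once the string column is gone the matrix can be cast to float and the product goes through, which is consistent with the fix of restricting the annotation columns to numeric ones.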

dwightman commented 4 years ago

Great, thanks. I removed the rsID column and it works well.

Doug
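
If you hit the same error, a quick pre-check of the annotation file can show which column is the culprit. The sketch below is only an illustration and is not part of PolyFun: the helper name drop_non_numeric_annotations and the list of metadata columns are assumptions, and the example row is the one posted in this thread.

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype

# ASSUMPTION: these columns are SNP metadata, everything else is an annotation.
META_COLUMNS = ["SNP", "CHR", "BP", "A1", "A2"]


def drop_non_numeric_annotations(df, meta_columns=META_COLUMNS):
    """Return (cleaned_df, dropped), where dropped lists the non-numeric,
    non-metadata columns removed from df (hypothetical helper)."""
    candidates = [c for c in df.columns if c not in meta_columns]
    dropped = [c for c in candidates if not is_numeric_dtype(df[c])]
    return df.drop(columns=dropped), dropped


# The header and row from this thread, read as whitespace-delimited text.
example = io.StringIO(
    "SNP CHR BP A1 A2 rsID Z N ANNO1 ANNO2\n"
    "1:100000012:G:T 1 100000012 T G rs10875231 -1.2578195807 652942.54 1 0\n"
)
df_annot = pd.read_csv(example, sep=r"\s+")
df_clean, dropped = drop_non_numeric_annotations(df_annot)
print("Dropped non-numeric columns:", dropped)
```

For a real file you would pass the path to pd.read_csv instead of the StringIO object (pandas infers gzip compression from a .gz extension), using whatever delimiter your file actually has. Note that Z and N are numeric, so a check like this does not flag them.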