fastlmm / FaST-LMM

Python version of Factored Spectrally Transformed Linear Mixed Models
https://fastlmm.github.io/
Apache License 2.0

ValueError: No objects to concatenate #17

Closed snowformatics closed 3 years ago

snowformatics commented 3 years ago

Hi,

I have a problem with some BED files generated by PLINK. With the same phenotype file, BED files created with different PLINK settings sometimes give this error message:

`
...
read 298900 SNPs in 49.29 seconds
read 299096 SNPs in 49.32 seconds
49.32 seconds elapsed
Ending '_read_with_standardizing'
Starting findH2
h2=0.99999
Traceback (most recent call last):
  File "C:/Users/id/PycharmProjects/BCC_Experiments/lmm/lmm01.py", line 31, in <module>
    results_df = single_snp(bed_fn, pheno_fn)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\fastlmm\association\single_snp.py", line 248, in single_snp
    runner = runner)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\mapreduce.py", line 202, in map_reduce
    result = runner.run(dist)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\runner\local.py", line 48, in run
    result = _run_all_in_memory(distributable)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\runner\__init__.py", line 30, in _run_all_in_memory
    return work.reduce(result_sequence)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\mapreduce.py", line 77, in reduce
    return self.reducer(output_seq)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\fastlmm\association\single_snp.py", line 228, in reducer_closure
    for e in frame_sequence:
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\runner\__init__.py", line 14, in work_sequence_to_result_sequence
    result = work()
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\mapreduce.py", line 65, in <lambda>
    yield lambda i=i, input_arg=input_arg: self.dowork(i, input_arg) # the 'i=i',etc is need to get around a strangeness in Python
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\mapreduce.py", line 92, in dowork
    result = _run_all_in_memory(work)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\runner\__init__.py", line 25, in _run_all_in_memory
    return work()
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\mapreduce.py", line 91, in <lambda>
    work = lambda : self.mapper(input_arg)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\fastlmm\association\single_snp.py", line 223, in nested_closure
    runner=Local(), xp=xp)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\fastlmm\association\single_snp.py", line 700, in _internal_single
    runner=runner)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\mapreduce.py", line 202, in map_reduce
    result = runner.run(dist)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\runner\local.py", line 48, in run
    result = _run_all_in_memory(distributable)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\runner\__init__.py", line 30, in _run_all_in_memory
    return work.reduce(result_sequence)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pysnptools\util\mapreduce1\mapreduce.py", line 77, in reduce
    return self.reducer(output_seq)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\fastlmm\association\single_snp.py", line 687, in reducer_closure
    frame = pd.concat(result_sequence)
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pandas\core\reshape\concat.py", line 295, in concat
    sort=sort,
  File "C:\Users\id\AppData\Local\Continuum\anaconda3\envs\gwas_flow\lib\site-packages\pandas\core\reshape\concat.py", line 342, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
`

When I run a PLINK association analysis on the same BED file, it works, so the BED file itself does not seem to be the problem.

This is the PLINK log for the BED file that works with fast-lmm:

PLINK v2.00a2.3 64-bit (24 Jan 2020)
Options in effect:
  --allow-extra-chr
  --geno 0.025
  --hwe 1E-6
  --keep ids_wgs_200cc.txt
  --maf 0.03
  --make-bed
  --out wgs_200cc_0025_003
  --set-missing-var-ids @:#
  --vcf SNP_matrix_WGS_300_samples.vcf.gz

This is the PLINK log for a BED file that does not work with fast-lmm:

PLINK v2.00a2.3 64-bit (24 Jan 2020)
Options in effect:
  --allow-extra-chr
  --geno 0.05
  --hwe 1E-5
  --keep ids_wgs_200cc.txt
  --maf 0.03
  --make-bed
  --out WGS_300_005_003_5
  --set-missing-var-ids @:#
  --vcf SNP_matrix_WGS_300_samples.vcf.gz

Do you have any idea what could be the problem?
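For readers comparing the two PLINK runs above, here is a minimal sketch that lists which options differ. The option strings are copied from the two logs; the helper names `parse_plink_options` and `differing_options` are made up for this illustration and are not part of PLINK or FaST-LMM.

```python
def parse_plink_options(options_text):
    """Map each --flag to the list of values that follow it."""
    options = {}
    current_flag = None
    for token in options_text.split():
        if token.startswith("--"):
            current_flag = token
            options[current_flag] = []
        elif current_flag is not None:
            options[current_flag].append(token)
    return options


def differing_options(first, second):
    """Return {flag: (values_in_first, values_in_second)} for flags that differ."""
    flags = sorted(set(first) | set(second))
    return {
        flag: (first.get(flag), second.get(flag))
        for flag in flags
        if first.get(flag) != second.get(flag)
    }


# Option lists copied from the two PLINK logs in this post.
working = parse_plink_options(
    "--allow-extra-chr --geno 0.025 --hwe 1E-6 --keep ids_wgs_200cc.txt "
    "--maf 0.03 --make-bed --out wgs_200cc_0025_003 "
    "--set-missing-var-ids @:# --vcf SNP_matrix_WGS_300_samples.vcf.gz"
)
failing = parse_plink_options(
    "--allow-extra-chr --geno 0.05 --hwe 1E-5 --keep ids_wgs_200cc.txt "
    "--maf 0.03 --make-bed --out WGS_300_005_003_5 "
    "--set-missing-var-ids @:# --vcf SNP_matrix_WGS_300_samples.vcf.gz"
)

option_diff = differing_options(working, failing)
for flag, (working_values, failing_values) in option_diff.items():
    print(flag, working_values, failing_values)
```

Apart from the output name, only the `--geno` and `--hwe` thresholds differ, so the two BED files were built from the same VCF but contain different sets of variants.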

Thanks a lot in advance,
Stefanie

CarlKCarlK commented 3 years ago

Stefanie,

Thanks for using FaST-LMM and thanks for reporting this problem.

It looks like it is complaining that there are no result rows, but that is weird because it also says it is reading SNPs and finding h2 (the relative weight between the similarity matrix and an identity matrix).

    import logging
    logging.basicConfig(level=logging.INFO)

Yours, Carl

Carl Kadie, Ph.D. FaST-LMM & PySnpTools Team https://fastlmm.github.io/ (Microsoft Research, retired) https://www.linkedin.com/in/carlk/

Join the FaST-LMM user discussion and announcement list (web sign-up: https://mail.python.org/mailman3/lists/fastlmm-user.python.org)


snowformatics commented 3 years ago

Hi Carl,

Thanks for your reply! I printed out a bunch of lines during the analysis and found that in the second BIM file a chromosome is labeled 0 (= unknown). After replacing 0 with another number (8), it worked.

fast-lmm 0.5.5, Python 3.7, Windows 10

Best,
Stefanie
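For anyone hitting the same error, the chromosome check described above can be sketched in plain Python. This assumes the standard PLINK .bim layout (whitespace-separated columns, chromosome code first, variant ID second). The sample lines and the helper names `chromosome_counts` and `variant_ids_on_chromosome` are hypothetical and exist only for this illustration.

```python
from collections import Counter


def chromosome_counts(bim_lines):
    """Count variants per chromosome label (first column of a .bim file)."""
    counts = Counter()
    for line in bim_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts


def variant_ids_on_chromosome(bim_lines, label="0"):
    """Return the variant IDs (second column) whose chromosome equals label."""
    ids = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) >= 2 and fields[0] == label:
            ids.append(fields[1])
    return ids


# Hypothetical .bim content, for illustration only.
sample_bim = [
    "1\t1:1000\t0\t1000\tA\tG",
    "0\t0:2500\t0\t2500\tC\tT",
    "8\t8:300\t0\t300\tG\tA",
    "0\t0:7200\t0\t7200\tT\tC",
]

counts = chromosome_counts(sample_bim)
unplaced = variant_ids_on_chromosome(sample_bim, label="0")
print(dict(counts))
print(unplaced)
```

On a real dataset, the same functions can be given an open file handle for the .bim file instead of the sample list. A non-zero count for `"0"` points to variants with an unknown chromosome, like the ones that were relabeled here.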

CarlKCarlK commented 3 years ago

Great!
