borenstein-lab / fishtaco

FishTaco (Functional Shifts Taxonomic Contributors) is a metagenomic computational framework that aims to identify the driver taxa of microbiome functional shifts

Error message: run FishTaco with PICRUSt-derived metagenomic functional profile #9

Closed Mingye-Peng closed 3 years ago

Mingye-Peng commented 3 years ago

Hi author,

When I run run_fishtaco.py with the following command:

run_fishtaco.py -ta gg13.5_otu_table_norm.txt -fu metagenome_predictions_MUSiCC_Normalized.txt -l mapping.txt -gc ko_13_5_precalculated.tab -op fishtaco_out_no_inf -map_function_level none -functional_profile_already_corrected_with_musicc -assessment single_taxa -log

I get the following error message:

Traceback (most recent call last):
  File "/home/apps/miniconda3/envs/qiime1/bin/run_fishtaco.py", line 100, in <module>
    main(vars(given_args))
  File "/home/apps/miniconda3/envs/qiime1/lib/python2.7/site-packages/fishtaco/compute_contribution_to_DA.py", line 174, in main
    if np.sum(np.isnan(taxa_to_function_data.values)) > 0:
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''.

I checked the format of all the files and confirmed that every txt file is tab-separated. The ko_13_5_precalculated.tab file was downloaded from the PICRUSt website (http://picrust.github.io/picrust/picrust_precalculated_files.html#id1). I modified the GreenGenes IDs in both the taxonomic abundance table (gg13.5_otu_table_norm.txt) and the genomic content table (ko_13_5_precalculated.tab) to prefix them with a non-numeric character (e.g., changing "228054" to "t228054").
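The ID change described above can be sketched in a few lines of Python. This is only an illustration, not part of FishTaco: the function name `prefix_row_ids` is hypothetical, and it assumes that the first line of each table is a header and that the ID is the first tab-separated field of every data row.

```python
def prefix_row_ids(lines, prefix="t"):
    """Yield table lines with `prefix` added to the ID of each data row.

    Assumes the first line is a header and that the ID is the first field,
    so prefixing the line is the same as prefixing the ID.
    """
    for index, line in enumerate(lines):
        if index == 0 or not line.strip():
            # Leave the header and any blank lines untouched.
            yield line
        else:
            yield prefix + line
```

The same prefix has to be applied to both the taxonomic abundance table and the genomic content table so that the IDs still match, for example by passing each open file to `prefix_row_ids` and writing the result with `writelines`.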

The following are screenshots of the relevant information for each input file: [four screenshots, not reproduced]

Thanks for your answer

engal commented 3 years ago

Hi,

I'm not entirely sure what is causing the issue, but it appears to be related to processing your taxa-to-function (i.e. genomic content) table. I've just tested with the standard PICRUSt table, and it looks like it should still be working. Could you copy the first 5-10 lines of the ko_13_5_precalculated.tab file you used when you got the error into a new file and send that smaller subset file so that I can test with it?
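One way to produce such a subset file is sketched below. The helper names `head_lines` and `write_head` are made up for this example, and the sketch assumes the table has already been unzipped to plain text.

```python
from itertools import islice


def head_lines(lines, n=10):
    """Return the first n lines of any iterable of lines."""
    return list(islice(lines, n))


def write_head(src_path, dst_path, n=10):
    """Copy the first n lines of src_path into dst_path."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        # Only the first n lines are read, so large tables are not loaded fully.
        dst.writelines(head_lines(src, n))
```

For instance, `write_head("ko_13_5_precalculated.tab", "ko_13_5_precalculated_sub.txt")` would write the header and the first nine data rows to the smaller file.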

Thanks!

Mingye-Peng commented 3 years ago

Hi,

I have sent all the files needed to run the FishTaco script to your email.

Please let me know whether you received them.

Thanks!

------------------ Original message ------------------ From: "borenstein-lab/fishtaco" <notifications@github.com>; Sent: Tuesday, November 3, 2020, 7:21 AM; To: "borenstein-lab/fishtaco"<fishtaco@noreply.github.com>; Cc: "個人←才孤单"<404668852@qq.com>;"Author"<author@noreply.github.com>; Subject: Re: [borenstein-lab/fishtaco] Error message: run FishTaco with PICRUSt-derived metagenomic functional profile (#9)

engal commented 3 years ago

Hi,

If you attempted to email the files separately (outside of GitHub), I do not think we have received them. However, I do not think all files will be necessary, at least for addressing this issue. If you could copy the first 5-10 lines of your genomic content file and post them as an attachment in this thread, I might be able to figure out what's going wrong.

Thanks!

Mingye-Peng commented 3 years ago

Hi,

I'm very sorry about that. I have copied the first 5-10 lines of the genomic content file (after unzipping ko_13_5_precalculated.tab.gz).

The contents are in the attached file: ko_13_5_precalculated_sub.txt

I noticed that the last column of the genomic content file (ko_13_5_precalculated.tab) is not a KO column; its header is metadata_NSTI.

Looking forward to receiving your help, thanks!

engal commented 3 years ago

Hi,

Can you try running FishTaco using the genomic content file you sent in your previous message and let me know what happens? I know it's not the full file, but my testing suggests that it shouldn't be giving you the same error that you originally reported, and I want to make sure that's true for you too.

Thanks!

Mingye-Peng commented 3 years ago

Hi,

I'm very sorry, I had not yet tried running the script with the subset of the genomic content file (ko_13_5_precalculated_sub.txt). When I ran it, the log showed "Writing output... Note that since there are no differentially abundant functions, there is no real output...", and a series of output files named fishtaco_out_no_infSTAT* were created, so I think the script ran successfully. However, when I run the script with the complete genomic content file (ko_13_5_precalculated.tab, downloaded from the PICRUSt website, with the IDs in the first column modified), I still get the previous error.
[screenshot of the error, not reproduced]
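Since the subset works and the full file does not, the problem is likely in rows beyond the first few. The `isnan` TypeError typically appears when a table is loaded as non-numeric data, which happens if at least one cell holds text. The sketch below is a hypothetical diagnostic, not part of FishTaco: `find_non_numeric_rows` assumes a tab-separated table with one header line and an ID in the first column, and reports every row that contains a cell that cannot be parsed as a number.

```python
def find_non_numeric_rows(lines, sep="\t"):
    """Return (line_number, row_id, bad_values) for rows with non-numeric cells.

    The first line is treated as a header and skipped. Line numbers are
    1-based so they match what a text editor shows.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if lineno == 1:
            continue  # header row holds column names, not values
        fields = line.rstrip("\r\n").split(sep)
        bad = []
        for value in fields[1:]:
            try:
                float(value)
            except ValueError:
                bad.append(value)
        if bad:
            problems.append((lineno, fields[0], bad))
    return problems
```

Calling it on the open file, as in `find_non_numeric_rows(open("ko_13_5_precalculated.tab"))`, lists the offending line numbers, which makes it easier to see where in a large file the parsing breaks.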

FishTaco was installed with conda inside the qiime1 virtual environment. In addition, could you upload a working genomic content file to GitHub or some other place, so that those who need it can download it?

Thanks!

engal commented 3 years ago

Hi,

It does look like there are some formatting issues with the existing PICRUSt genomic content table, but I think you should be able to work around them by removing the last two lines of the file. These contain function descriptions for the columns and cause parsing issues in FishTaco. There are also a couple of empty rows somewhere in the middle of the file, but I don't think those are causing issues at the moment.
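The workaround of removing the last two lines could be scripted roughly as follows. This is a minimal sketch rather than an official fix: `drop_last_lines` and `trim_file` are hypothetical names, and the code assumes the two description lines really are the final two lines of the file, with no blank lines after them.

```python
from collections import deque


def drop_last_lines(lines, n=2):
    """Yield every line except the final n, holding only n lines in memory."""
    buffer = deque()
    for line in lines:
        buffer.append(line)
        if len(buffer) > n:
            # The oldest buffered line cannot be one of the last n anymore.
            yield buffer.popleft()


def trim_file(src_path, dst_path, n=2):
    """Write src_path to dst_path without its last n lines."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(drop_last_lines(src, n))
```

For example, `trim_file("ko_13_5_precalculated.tab", "ko_13_5_precalculated_trimmed.tab")` would write a trimmed copy (the output file name is only an illustration) that can then be passed to `-gc`. Streaming through a small buffer avoids reading the whole table into memory, which matters for a file this large.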

I'm currently looking into a way to host a "fixed" version of this file, but it's too large to directly host on GitHub for free.

Let me know if removing those two lines helps.

Mingye-Peng commented 3 years ago

Hi,

Thank you very much for your advice. After I removed the last two lines of the PICRUSt genomic content table, the FishTaco script ran successfully. I hope you can explain this in the FishTaco readme for those who need to use the script in the future.

Good luck!

engal commented 3 years ago

Glad it's working for you, and thanks for the suggestion! I've updated the FAQ to reflect this necessary edit. I've also just pushed an update to FishTaco that should allow you to use purely numeric OTU IDs (you won't need to add a letter to the front of each OTU ID anymore).

I'm going to close this issue out, but please let us know if you run into any other issues.