biocore / songbird

Vanilla regression methods for microbiome differential abundance analysis
BSD 3-Clause "New" or "Revised" License
Problems with installation #158

Closed pig-raffles closed 2 years ago

pig-raffles commented 2 years ago

Hi,

I am having problems installing Songbird. The installation appears to succeed with no error messages, but I get errors when I try to run Songbird. I have tried both the standalone version and the qiime plugin, and neither seems to work.

When I install the standalone version and try to use songbird, I get the following error message:

"Traceback (most recent call last): File "/Users/alan02/miniconda3/envs/songbird_env/bin/songbird", line 8, in from songbird.multinomial import MultRegression File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/songbird/multinomial.py", line 3, in from tensorflow.contrib.distributions import Multinomial, Normal File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow/init.py", line 50, in getattr module = self._load() File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow/init.py", line 44, in _load module = _importlib.import_module(self.name) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/importlib/init.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_core/contrib/init.py", line 39, in from tensorflow.contrib import compiler File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_core/contrib/compiler/init.py", line 21, in from tensorflow.contrib.compiler import jit File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_core/contrib/compiler/init.py", line 22, in from tensorflow.contrib.compiler import xla File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_core/contrib/compiler/xla.py", line 22, in from tensorflow.python.estimator import model_fn as model_fn_lib File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_core/python/estimator/model_fn.py", line 26, in from tensorflow_estimator.python.estimator import model_fn File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_estimator/init.py", line 10, in from tensorflow_estimator._api.v1 import estimator File 
"/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_estimator/_api/v1/estimator/init.py", line 10, in from tensorflow_estimator._api.v1.estimator import experimental File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_estimator/_api/v1/estimator/experimental/init.py", line 10, in from tensorflow_estimator.python.estimator.canned.dnn import dnn_logit_fn_builder File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/canned/dnn.py", line 27, in from tensorflow_estimator.python.estimator import estimator File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 36, in from tensorflow.python.profiler import trace ImportError: cannot import name 'trace' from 'tensorflow.python.profiler' (/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/tensorflow_core/python/profiler/init.py)"

With the qiime plugin, I get the following error message: "QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment. Illegal instruction: 4"

Finally, how do I access the example data files from songbird? I have my own files, but I would like to make sure songbird is running correctly first. I have tried looking for them on my computer but can't locate them.

Thanks in advance for any advice you can give on getting songbird to run properly.

mortonjt commented 2 years ago

Hi @pig-raffles, it would be helpful if you included the exact installation instructions you used. For instance, did you run pip install tensorflow==1.15.2?

Regarding the example data files, they are all in the repo at https://github.com/biocore/songbird/tree/master/data and should be documented in the README, but feel free to follow up if that isn't already clear.
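
One way to answer the version question is to print what is actually installed in the active environment. The traceback above fails while tensorflow_estimator imports `trace` from `tensorflow.python.profiler`, which suggests (but does not prove) that the installed tensorflow and tensorflow-estimator packages are at incompatible versions. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8; on Python 3.7, `pip list` reports the same information). The helper name `installed_versions` and the list of package names are illustrative assumptions, not part of Songbird.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed list of distribution names relevant to this issue.
    names = ["songbird", "tensorflow", "tensorflow-estimator", "pandas"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Posting this output alongside the exact install commands makes a version mismatch easy to spot.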

pig-raffles commented 2 years ago

Thanks for getting back to me so quickly

Sorry was being dumb about the example files. Was looking on my computer rather than GitHub. Found it very easily.

For the standalone installation, the error message I showed above came from following the README.md instructions exactly:

    conda create -n songbird_env songbird "pandas>=0.18.0,<1" -c conda-forge
    source activate songbird_env

When I additionally run pip install "tensorflow<2", the installation is apparently successful, but I get the following error message:

"ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. songbird 1.0.3 requires nose>=1.3.7, which is not installed"

I tried running songbird on the example data without installing "nose". Despite the above error message, this seems to work and produces a "differential.tsv" file.

When I try running the same default run command with my own files:

    songbird multinomial \
        --input-biom KO_SaltvsFresh_Ant.biom \
        --metadata-file Metadata_SaltvsFresh_Ant.txt \
        --formula "Treatment" \
        --epochs 10000 \
        --differential-prior 0.5 \
        --training-column Testing \
        --summary-interval 1 \
        --summary-dir results

I get the following error message:

"Traceback (most recent call last): File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2897, in get_loc return self._engine.get_loc(key) File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item KeyError: 'Testing'

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "/Users/alan02/miniconda3/envs/songbird_env/bin/songbird", line 225, in songbird() File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/click/core.py", line 1128, in call return self.main(args, kwargs) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/click/core.py", line 1053, in main rv = self.invoke(ctx) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/click/core.py", line 1659, in invoke return _process_result(sub_ctx.command.invoke(sub_ctx)) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/click/core.py", line 1395, in invoke return ctx.invoke(self.callback, ctx.params) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/click/core.py", line 754, in invoke return __callback(args, **kwargs) File "/Users/alan02/miniconda3/envs/songbird_env/bin/songbird", line 180, in multinomial seed=random_seed, File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/songbird/util.py", line 191, in split_training train_idx = metadata.loc[design.index, training_column] == "Train" File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/indexing.py", line 1418, in getitem return self._getitem_tuple(key) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/indexing.py", line 805, in _getitem_tuple return self._getitem_lowerdim(tup) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/indexing.py", line 929, in _getitem_lowerdim section = self._getitem_axis(key, axis=i) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/indexing.py", line 1850, in _getitem_axis return self._get_label(key, axis=axis) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/indexing.py", line 160, in _get_label return self.obj._xs(label, 
axis=axis) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/generic.py", line 3729, in xs return self[key] File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/frame.py", line 2995, in getitem indexer = self.columns.get_loc(key) File "/Users/alan02/miniconda3/envs/songbird_env/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 2899, in get_loc return self._engine.get_loc(self._maybe_cast_indexer(key)) File "pandas/_libs/index.pyx", line 107, in pandas._libs.index.IndexEngine.get_loc File "pandas/_libs/index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc File "pandas/_libs/hashtable_class_helper.pxi", line 1607, in pandas._libs.hashtable.PyObjectHashTable.get_item File "pandas/_libs/hashtable_class_helper.pxi", line 1614, in pandas._libs.hashtable.PyObjectHashTable.get_item KeyError: 'Testing'"

Is there a problem with my file formatting?

pig-raffles commented 2 years ago

The biom format (Hierarchical Data Format (version 5) data) seems to be the same for both the example feature-table.biom and my biom file.

Is there anything else that could be problematic with my data files?

mortonjt commented 2 years ago

The error message you are getting here is due to your sample metadata: you passed --training-column Testing, but the metadata has no column named Testing. You have 2 options: either add a column with train/test labels for each sample, or drop the --training-column argument and let Songbird fill that out for you automatically.
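
This KeyError can be caught before a long run by checking the metadata header first. The following is a minimal standard-library sketch, assuming a tab-separated metadata file with a header row. The helper name `check_training_column` is hypothetical, and the "Train" label is inferred from the `== "Train"` comparison in the traceback above.

```python
import csv
import io


def check_training_column(handle, column):
    """Return (column_present, sorted labels found in that column)."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    if column not in header:
        return False, []
    # Short rows yield None for missing cells; report those as "".
    labels = sorted({row.get(column) or "" for row in reader})
    return True, labels


# Illustrative in-memory metadata without a "Testing" column;
# real use would pass an open file handle for the metadata file instead.
example = io.StringIO(
    "sampleid\tTreatment\n"
    "S1\tSalt\n"
    "S2\tFresh\n"
)
print(check_training_column(example, "Testing"))
```

If the column is reported as missing, the name passed to --training-column does not match any column header in the metadata.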

pig-raffles commented 2 years ago

Thanks again,

I am now getting the error message: ValueError: initial_value must have a shape specified: Tensor("random_normal:0", shape=(2, ?), dtype=float32)

In another issue you state that this is caused by an empty dataset, with the potential causes being:

  1. The samples are all filtered out, either because the table is normalized to 1 or all of the samples have too few reads.
  2. There are too few samples in your subset (it is worth checking out the --min-feature-count and --min-sample-count args).
  3. The sample metadata sample names don't match the sample names in the biom table.

I have checked all of these and can't find an issue: the minimum number of counts per sample is > 1000, and the minimum number of samples a feature needs to be observed in is not an issue. The names appear to be identical in the biom file and the metadata.

Could there be another issue causing this? biom format? too few samples?

Thanks,

Alan

mortonjt commented 2 years ago

I'd try the following to double check:

  1. --min-sample-count 0
  2. --min-feature-count 0
  3. Run the following
    import biom
    import pandas as pd
    table = biom.Table('<your_file.biom>')
    md = pd.read_table('<your metadata file>', index_col=0)
    print(len(set(md.index) & set(table.ids())))

    If you are still getting errors and your sample ids are indeed matching, it'll help if you attach your files to this issue to help with debugging.
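
The two filter thresholds can also be checked directly against the counts. Below is a hedged standard-library sketch, assuming a tab-delimited table with features as rows and samples as columns. The function `surviving_ids` and its parameter names are hypothetical; they only mirror the meaning of the flags as described in this thread (minimum total counts per sample, minimum number of samples a feature is observed in). Whether the comparison is strict is an assumption, and this is not Songbird's own filtering code.

```python
import csv
import io


def surviving_ids(handle, min_sample_count, min_feature_count):
    """Return (samples, features) that pass both thresholds."""
    rows = [r for r in csv.reader(handle, delimiter="\t") if r]
    sample_ids = rows[0][1:]  # first header cell labels the feature column
    totals = dict.fromkeys(sample_ids, 0.0)
    prevalence = {}
    for row in rows[1:]:
        feature, values = row[0], [float(v) for v in row[1:]]
        for sample, value in zip(sample_ids, values):
            totals[sample] += value
        # Prevalence: number of samples in which the feature is non-zero.
        prevalence[feature] = sum(v > 0 for v in values)
    # Strict ">" is an assumption; adjust if the tool uses ">=".
    samples = [s for s in sample_ids if totals[s] > min_sample_count]
    features = [f for f, n in prevalence.items() if n > min_feature_count]
    return samples, features


# Illustrative table whose columns each sum to 1 (a normalized table).
normalized = io.StringIO("feature\tS1\tS2\nK1\t0.5\t0.25\nK2\t0.5\t0.75\n")
print(surviving_ids(normalized, 1000, 0))
```

A table whose columns each sum to 1 leaves no samples at a threshold of 1000, which is the first cause listed earlier in the thread.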

pig-raffles commented 2 years ago

Hi,

When I ran "table = biom.Table('')", I got the following error message:

"Traceback (most recent call last): File "", line 1, in TypeError: init() missing 2 required positional arguments: 'observation_ids' and 'sample_ids'"

This suggests to me that my original OTU file was in the wrong format or that the conversion script is not compatible. To create the biom file, I used the following command:

"biom convert -i KO_pred_ancom_SaltvsFresh_Ant.txt -o KO_SaltvsFresh_Ant.biom --to-hdf5"

I have attached the original tab-delimited text file that I converted into biom format: KO_pred_ancom_SaltvsFresh_Ant.txt

There did not seem to be any errors associated with the metadata file.

mortonjt commented 2 years ago

hmm. I won't be able to reproduce your exact error without the exact commands.

But with the text file that you have, you can still run the following:

import pandas as pd
# read the tab-delimited table (samples as columns) and the sample metadata
table = pd.read_table('KO_pred_ancom_SaltvsFresh_Ant.txt', index_col=0)
md = pd.read_table('<your metadata file>', index_col=0)
# number of sample ids shared by the metadata and the table
print(len(set(md.index) & set(table.columns)))
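
The printed number says how many sample IDs match, not which ones fail. A minimal standard-library sketch that lists the unmatched IDs on each side is shown below. It assumes both files are tab-delimited, with sample IDs in the table's header row and in the first metadata column; `unmatched_ids` is a hypothetical helper name. Separately, the earlier TypeError is what the `biom.Table` constructor raises when it is given only a path, since it expects data plus observation and sample ID lists (`biom.load_table` is the biom-format function that reads a file), so that error by itself does not indicate a badly formatted table.

```python
import csv
import io


def unmatched_ids(table_handle, metadata_handle):
    """Return (IDs only in the table header, IDs only in the metadata)."""
    table_header = next(csv.reader(table_handle, delimiter="\t"))
    table_ids = set(table_header[1:])  # first cell labels the feature column
    metadata_rows = csv.reader(metadata_handle, delimiter="\t")
    next(metadata_rows)  # skip the metadata header row
    metadata_ids = {row[0] for row in metadata_rows if row}
    return sorted(table_ids - metadata_ids), sorted(metadata_ids - table_ids)


# Illustrative in-memory files; note the trailing space in "S2 ".
table = io.StringIO("feature\tS1\tS2 \nK1\t5\t7\n")
metadata = io.StringIO("sampleid\tTreatment\nS1\tSalt\nS2\tFresh\n")
print(unmatched_ids(table, metadata))
```

IDs are compared without stripping whitespace, so names that look identical on screen but differ by a stray space show up in both lists.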

P.S. I'm going to close this issue, since this is much more appropriate for the qiime2 forums.