chrisquince / DESMAN

De novo Extraction of Strains from MetAgeNomes

Pandas error in Filter_Variant step while using Desman snakemake version #45

Open osvatic opened 4 years ago

osvatic commented 4 years ago

Desman fails during the Filter_Variant step. Probably an issue with pandas.

Attached is the running output (DESnake-7223047.txt) and the Bin_1_Filter_variant.log

DESnake-7223047.txt

Bin_1_Filter_variant.log

chrisquince commented 4 years ago

I have added a branch that is compatible with the latest Pandas. It is called:

PandasRemove

but really I should remove pandas :)

icotto25 commented 4 years ago

Hello, I also had the problem with Filter_Variant, and thanks to this issue I was able to solve it. Now I am trying to run the desman script and I am getting the same error: "AttributeError: 'DataFrame' object has no attribute 'as_matrix'". Is this also an issue with pandas?

Thank you,

Irmarie

osvatic commented 4 years ago

Hey Irmarie,

This happened to me too. For me, it was a problem with which desman script I was calling. If you are calling the old desman script (with pandas), this is the error that shows up. You should check whether 'as_matrix' appears in the desman script that snakemake is calling. If it does, you may need to reinstall the PandasRemove branch. You can also adjust the path to the desman script in the snakemake file to make sure that it is using the new (non-pandas) script.
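
The check described above can be sketched in Python. This is only a minimal sketch and not part of DESMAN: the function names `find_as_matrix_calls` and `report` are hypothetical, and it assumes the script snakemake invokes is named `desman` and is on your `PATH`. If your snakemake file points at a different path, pass that path to `find_as_matrix_calls` directly.

```python
import shutil
from pathlib import Path


def find_as_matrix_calls(script_path):
    """Return (line_number, text) pairs for lines that mention as_matrix."""
    hits = []
    text = Path(script_path).read_text(encoding="utf-8", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        if "as_matrix" in line:
            hits.append((number, line.strip()))
    return hits


def report(executable="desman"):
    """Locate an executable on PATH and print any as_matrix usages in it."""
    # Assumption: the script is installed on PATH under this name.
    location = shutil.which(executable)
    if location is None:
        print(f"{executable} not found on PATH")
        return
    hits = find_as_matrix_calls(location)
    print(f"{location}: {len(hits)} line(s) mention as_matrix")
    for number, line in hits:
        print(f"  line {number}: {line}")
```

Calling `report()` from inside the environment that snakemake uses shows which copy of the script is picked up. Any reported lines suggest that the old script is still the one being called.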

chrisquince commented 4 years ago

Yes, just to clarify: for the fix you need a particular Desman branch. So:

git clone -b PandasRemove https://github.com/chrisquince/DESMAN.git

and then install that branch as you would normally.

I have not pushed these changes to the master branch as it could break installations using earlier Pandas versions. The best solution would be for me to recode everything without Pandas.
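
For readers patching their own copy, one way to support both old and new Pandas versions is a small compatibility helper. This is a sketch under stated assumptions, not necessarily what the PandasRemove branch does: `frame_to_array` is a hypothetical name. It relies on `DataFrame.to_numpy()` (available since pandas 0.24) and falls back to the `.values` attribute on older versions.

```python
def frame_to_array(frame):
    """Return the NumPy array behind a DataFrame on old and new pandas alike."""
    if hasattr(frame, "to_numpy"):
        # pandas 0.24 and later
        return frame.to_numpy()
    # Older pandas: .values holds the same data that as_matrix()
    # returned when called without arguments.
    return frame.values
```

A call such as `variants.as_matrix()` would then become `frame_to_array(variants)`.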

icotto25 commented 4 years ago

Thank you both for the help. I installed the PandasRemove branch and it apparently ran just fine (no errors).

icotto25 commented 4 years ago

Hi, now I am trying to run "resolvenhap.py" to determine the optimal number of strains. However, I am getting the same error again (AttributeError: 'DataFrame' object has no attribute 'as_matrix') even though I am calling the script from the PandasRemove branch. Maybe there is something very basic I am missing, but I don't have much experience in bioinformatics.

chrisquince commented 4 years ago

The problem here comes down to the pandas incompatibilities between versions. I have updated resolvenhap.py to use as_matrix in this script as I did in the others, but for some reason you must be using an old pandas library when you try to run it. Are you running it exactly the same way as the others?
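
A quick way to see which pandas library an environment actually imports is to print its version and location from the same interpreter that runs the script. The sketch below uses hypothetical function names. It relies on the fact that `DataFrame.as_matrix` was deprecated in pandas 0.23 and removed in pandas 1.0, so only the major version number is checked.

```python
def as_matrix_removed(pandas_version):
    """True if this pandas version no longer has DataFrame.as_matrix."""
    # as_matrix was removed in pandas 1.0, so the major version decides.
    major = int(pandas_version.split(".")[0])
    return major >= 1


def describe_environment():
    """Print the version and location of the pandas this interpreter imports."""
    try:
        import pandas
    except ImportError:
        print("pandas is not installed in this environment")
        return
    status = "removed" if as_matrix_removed(pandas.__version__) else "still available"
    print(f"pandas {pandas.__version__} from {pandas.__file__}")
    print(f"DataFrame.as_matrix is {status} in this version")
```

Running `describe_environment()` in each environment (for example the one used for the desman script and the one used for resolvenhap.py) shows whether they pick up the same pandas installation.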

icotto25 commented 4 years ago

Yes, I am running it exactly the same as I ran the desman script. I installed the DESMAN PandasRemove branch in an environment called "py13". When I run the "resolvenhap.py" script with only the help flag, I get this: [image: image.png]

However, this is what I get when I run it with the haplotype frequencies file:

[image: image.png]

annecmg commented 4 years ago

Hi Chris, is it possible to provide the PandasRemove version in conda, or to specify the pandas dependencies in conda? On my system it is not possible to install it from source, and when installing it from conda I get the following error:

Traceback (most recent call last):
  File "/opt/share/software/packages/desman-2.1/conda-env/bin/Variant_Filter.py", line 561, in <module>
    main(sys.argv[1:])
  File "/opt/share/software/packages/desman-2.1/conda-env/bin/Variant_Filter.py", line 514, in main
    mCogFilter = args.outlier_thresh,cogSampleFrac=args.sample_frac)
  File "/opt/share/software/packages/desman-2.1/conda-env/bin/Variant_Filter.py", line 74, in __init__
    variants_matrix = variants.as_matrix()
  File "/opt/share/software/packages/desman-2.1/conda-env/lib/python3.7/site-packages/pandas/core/generic.py", line 5274, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'as_matrix'

Thanks a lot! Anne

chrisquince commented 4 years ago

Hi Anne,

In the first instance I think we can provide a conda environment to be resolved. We will look into that now. Later we can then create a new conda package.

Thanks, Chris

annecmg commented 3 years ago

Hi Chris, have you had a chance to look into this issue? When creating a conda environment for desman, I still get this error. In addition, I noticed the following error while creating the environment:

zstd-1.4.5           | 712 KB    |            |   0%
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/opt/share/software/packages/miniconda3-4.7.12/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/opt/share/software/packages/miniconda3-4.7.12/lib/python3.7/site-packages/conda/_vendor/tqdm/_monitor.py", line 86, in run
    if instances != self.tqdm_cls._instances:  # pragma: nocover
  File "/opt/share/software/packages/miniconda3-4.7.12/lib/python3.7/_weakrefset.py", line 172, in __eq__
    return self.data == set(map(ref, other))
  File "/opt/share/software/packages/miniconda3-4.7.12/lib/python3.7/site-packages/conda/_vendor/tqdm/_tqdm.py", line 872, in __eq__
    return abs(self.pos) == abs(other.pos)
AttributeError: 'tqdm' object has no attribute 'pos'

Conda still finishes successfully and I am not sure whether this has an impact on the installation.

Thanks, Anne

chrisquince commented 3 years ago

Hi Anne,

Yes, sorry. The problem is that we neither created nor maintain the Desman conda environment, so, as I mentioned above, providing a new recipe ourselves in the Desman distribution is the easier solution. We will get on that right away.

Thanks, Chris