shahab-sarmashghi / RESPECT

Estimating repeat spectra and genome length from low-coverage genome skims

DataFrame.append is not part of the latest pandas #13

Closed by KamilSJaron 1 year ago

KamilSJaron commented 1 year ago

Hi folks,

I finally got around to benchmarking RESPECT. Thanks for making it! I am very curious how it will perform on our low-coverage k-mer spectra (when individual coverage peaks are not visible, genomescope is totally useless). I will keep you posted on how it goes.

But first, I ran into a silly problem.

respect -k 31 --threads 4 -i ddAraThal4.k31.hist -I ddAraThal4_files.tab 
2023-03-29 16:39:06,476 INFO:Processing ddAraThal4.k31.hist...
2023-03-29 16:39:06,719 INFO:estimate_genome_skim_parameters finished in 0.21167302131652832 seconds
2023-03-29 16:39:06,719 ERROR:Error occurred when estimating parameters for /l.../ddAraThal4/ddAraThal4.k31.hist; it's skipped
...
<Traceback>
...
AttributeError: 'DataFrame' object has no attribute 'append'
2023-03-29 16:39:06,723 ERROR:Error occurred while trying to get estimated parameters for a sample
2023-03-29 16:39:06,726 INFO:Writing the results to the output files...

I think the problem is that my pandas is too new (it is the version conda installs by default these days) and no longer has an append method on DataFrames: see the deprecation notice.

I am pretty sure that's the source, since downgrading pandas made the error disappear. In case you don't want to touch the code, you can just update the installation procedure...
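For anyone who would rather patch the code than pin pandas, the usual replacement for DataFrame.append is pd.concat. Below is a minimal sketch of that pattern, not the actual RESPECT code: the helper name append_row and the column names sample and coverage are made up for illustration.

```python
import pandas as pd


def append_row(df: pd.DataFrame, row: dict) -> pd.DataFrame:
    """Return a new DataFrame with `row` added at the end.

    Hypothetical helper (not part of RESPECT): it avoids DataFrame.append,
    so it behaves the same on old and new pandas releases.
    """
    # Wrap the dict in a one-row DataFrame, then concatenate the two frames.
    new_row = pd.DataFrame([row])
    return pd.concat([df, new_row], ignore_index=True)


# Illustrative data only; the column names are assumptions.
results = pd.DataFrame({"sample": ["a"], "coverage": [1.5]})
results = append_row(results, {"sample": "b", "coverage": 2.0})
print(results)
```

Because pd.concat has been in pandas for a long time, a change like this should not require any particular pandas version, although I have not tried it against the RESPECT code base itself.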

Update

This is what actually worked:

mamba create -n "respect" python=3.10.0
mamba activate respect
mamba install pandas=1.4.0=py310hb5077e9_0
conda install jellyfish seqtk gurobi 
python setup.py install

(This causes a lot of warnings, though, and the output is a bit wild; perhaps you should pick an even older pandas version.)

Update 2

Now that I have checked the install more closely, it's most likely that creating a conda env with one of the Python versions you mention in the README would resolve all the problems.

jflot commented 1 year ago

Hi Kamil and everyone, an update on this issue: creating a conda env with Python 3.8 (the most recent version mentioned in the README) did not resolve the pandas version issue (pandas 2.0.1 still got installed). However, creating a conda env with Python 3.6 did the job (pandas 0.22 got installed instead):

conda create -n respect -c bioconda -c https://conda.anaconda.org/gurobi python=3.6 jellyfish seqtk gurobi
conda activate respect
git clone https://github.com/shahab-sarmashghi/RESPECT.git
cd RESPECT
python setup.py install
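Since the Python version alone does not determine which pandas ends up in the env, a quick check from inside the activated env can confirm what was actually installed. This is only a sketch; describe_environment is a made-up helper name, not something RESPECT provides.

```python
import sys

import pandas as pd


def describe_environment() -> str:
    """Summarize the interpreter and pandas versions in the active env.

    Hypothetical helper for illustration; not part of RESPECT.
    """
    # DataFrame.append is absent from pandas 2.x, so probe for it directly.
    has_append = hasattr(pd.DataFrame, "append")
    python_version = "{}.{}".format(*sys.version_info[:2])
    return "python {}, pandas {}, DataFrame.append available: {}".format(
        python_version, pd.__version__, has_append
    )


print(describe_environment())
```

If the last field reads False, the AttributeError from the original report is to be expected with the unpatched code.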

shahab-sarmashghi commented 1 year ago

Thank you @KamilSJaron and @jflot, I will change the pandas version requirements.

shahab-sarmashghi commented 1 year ago

Hi folks, just wanted to let you know that I pinned the prerequisites to the specific versions I had tested with Python 3.7 and 3.8, so you should no longer need to manually downgrade any dependencies.

shahab-sarmashghi commented 1 year ago

It turned out my previous fix was incomplete and there were still issues with some combinations of versions. I had to drop support for any Python version other than 3.7 or 3.8 and limit the versions of some dependencies. It is fully tested now, so hopefully there will not be any further issues.