parklab / HiNT

HiC for copy Number variation and Translocation detection
MIT License

Deprecated `.ix` Pandas method causes `hint cnv` to fail. #14

Open jrhawley opened 3 years ago

jrhawley commented 3 years ago

Hi Park lab,

I'll start by saying thanks for all the work put into this package and having it support a bunch of input file formats. That's really great and makes it easy for people to use this tool.

The issue

When running the cnv subcommand, CNV calling starts off normally, but it fails when writing the first set of row sums. Here is an example output (cooler input file, hg38 genome):

...
chr1 chrX
Writing rowsums of chr1!
Traceback (most recent call last):
  File "/mnt/work1/users/home2/hawleyj/miniconda3/envs/hint/bin/hint", line 203, in <module>
    main()
  File "/mnt/work1/users/home2/hawleyj/miniconda3/envs/hint/bin/hint", line 196, in main
    cnvrun(argparser)
  File "/mnt/work1/users/home2/hawleyj/miniconda3/envs/hint/lib/python3.6/site-packages/HiNT/getGenomeRowSumsFromCool.py", line 31, in getSumPerChrom
    writeGenomeRowSums(coolfile,chromRowsums,chrom1,outputname,name)
  File "/mnt/work1/users/home2/hawleyj/miniconda3/envs/hint/lib/python3.6/site-packages/HiNT/getGenomeRowSumsFromCool.py", line 36, in writeGenomeRowSums
    allbins = bins[bins['chrom']== chrom].ix[:0:3]
  File "/mnt/work1/users/home2/hawleyj/miniconda3/envs/hint/lib/python3.6/site-packages/pandas/core/generic.py", line 5141, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'ix'

Potential problem source

The most likely culprit is the .ix pandas DataFrame method in this line. The .ix method was deprecated in v0.20.0 of Pandas and was finally removed in the v1.0 release, so any environment with Pandas 1.0 or later raises the AttributeError shown above.

Possible solutions

Currently, in setup.py and requirements.txt, the Pandas version is specified as pandas>=0.23.0. This could be capped below v1.0.0 with pandas>=0.23.0,<1.0.0.
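To make the effect of that cap concrete, here is a minimal sketch of which Pandas versions it would admit. The helper name `satisfies_cap` is hypothetical and not part of HiNT, and it only handles plain numeric versions such as `1.1.5` (no pre-release suffixes).

```python
# Requirement string proposed above, as it would appear in
# install_requires (setup.py) or in requirements.txt.
PANDAS_REQUIREMENT = "pandas>=0.23.0,<1.0.0"


def satisfies_cap(version, lower=(0, 23, 0), upper=(1, 0, 0)):
    """Return True if a plain 'X.Y.Z' version lies in [lower, upper).

    Hypothetical helper for illustration only; it does not parse
    pre-release or local version suffixes.
    """
    nums = [int(part) for part in version.split(".")[:3]]
    nums += [0] * (3 - len(nums))  # pad "1.0" to (1, 0, 0)
    return lower <= tuple(nums) < upper


# The version from my environment falls outside the capped range,
# while the currently declared minimum falls inside it.
print(satisfies_cap("1.1.5"))
print(satisfies_cap("0.23.0"))
```

With the cap in place, a fresh install would resolve to a pre-1.0 Pandas release instead of 1.1.5.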

Or, to extend the longevity of the tool, the .ix method could be swapped out in favour of .iloc or a related method.
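As a rough sketch of that swap, assuming the failing line is meant to select all rows and the first three columns by position (that is, `.ix[:, 0:3]`; the traceback above may not show the exact slice), the `.iloc` equivalent would look like the following. The `bins` table here is a made-up stand-in for the cooler bin table, and its column names are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the cooler bin table; the column names
# and values are assumptions made for illustration.
bins = pd.DataFrame(
    {
        "chrom": ["chr1", "chr1", "chr2"],
        "start": [0, 50000, 0],
        "end": [50000, 100000, 50000],
        "weight": [1.0, 0.9, 1.1],
    }
)
chrom = "chr1"

# Assumed intent of the original `.ix` call: keep every row for this
# chromosome and the first three columns, selected by position.
allbins = bins[bins["chrom"] == chrom].iloc[:, 0:3]
print(allbins)
```

`.iloc` is purely positional, so it behaves the same on old and new Pandas releases and would not need a version cap.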

I don't know how extensive the issue may be, but searching the repository only returns the single instance in getGenomeRowSumsFromCool.py, so this may be a simple fix.

Installation configuration details, if helpful

| Variable | Value |
| --- | --- |
| OS | CentOS Linux 7.5.1804 |
| Installation method | conda install -c su hint (inside a fresh conda environment) |
| Python version | 3.6.13 |
| Pandas version | 1.1.5 |

7insong commented 1 month ago

Yes, I solved this issue by running `conda install pandas=0.23.0`.