EI-CoreBioinformatics / portcullis

Splice junction analysis and filtering from BAM files
https://ei-corebioinformatics.github.io/portcullis/
GNU General Public License v3.0

AttributeError: type object 'DataFrame' has no attribute 'from_csv' #50

Open lijing28101 opened 4 years ago

lijing28101 commented 4 years ago

Hi, my command is:

portcullis full --threads 28 --verbose --use_csi --output portcullis_out --orientation FR Saccharomyces_cerevisiae.fa Saccharomyces_cerevisiae_rnaseq_sorted.bam

When the run reaches the filter step, it shows this error:

Executing python script with this command: portcullis/rule_filter.py portcullis/rule_filter.py --pos_json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_pos.layer1.json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_pos.layer2.json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_pos.layer3.json --neg_json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_neg.layer1.json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_neg.layer2.json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_neg.layer3.json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_neg.layer4.json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_neg.layer5.json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_neg.layer6.json //pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/balanced/selftrain_initial_neg.layer7.json --prefix=portcullis_out/3-filt/portcullis_filtered.selftrain.initialset portcullis_out/2-junc/portcullis_all.junctions.tab
Loading input junctions ... Traceback (most recent call last):
  File "//pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/scripts/portcullis/rule_filter.py", line 392, in <module>
    if __name__=='__main__': main()
  File "//pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/scripts/portcullis/rule_filter.py", line 385, in main
    create_training_sets(args)
  File "//pylon5/mc5pl7p/lijingqb/bin/anaconda3/envs/mikado/share/portcullis/scripts/portcullis/rule_filter.py", line 127, in create_training_sets
    original = DataFrame.from_csv(args.input, sep='\t', header=0)
AttributeError: type object 'DataFrame' has no attribute 'from_csv'

Portcullis completed.
Total runtime: 69.6s

../lib/include/portcullis/python_helper.hpp(167): Throw in function void portcullis::PyHelper::execute(std::__cxx11::string, int, char**)
Dynamic exception type: boost::exception_detail::clone_impl<portcullis::PortcullisPythonException>
std::exception::what: std::exception

I found that DataFrame no longer has from_csv; it should be pd.read_csv. I manually revised the code in rule_filter.py, but that gave me a new error:

src/junction.cc(1166): Throw in function static std::shared_ptr<portcullis::Junction> portcullis::Junction::parse(const string&)
Dynamic exception type: boost::exception_detail::clone_impl<portcullis::JunctionException>
std::exception::what: std::exception
[portcullis::JunctionError*] = Could not parse line due to incorrect number of columns.  This is probably a version mismatch.  Check file and portcullis versions.  Expected 75 columns.  Found 76.  Line: 37   37  0   I   230218  87387   87499   113 87312   87574   +   +   +   GT  AG  C   0   0   0   2191    541 2191    0   2191    0   2169    2142149 0.9808309999999999  0   1503    688 0   4.88665 0.0826107   75  38  38  0   10  5   0   4808    10375   4808    0   0   0   0   1   2185    2161    2123    2083    2035    1993    1969    1953    1944    1901864 1852    1810    1790    1749    1396    1361    1318    1082    871

I use Python 3.6. Please help me solve the problem. Thanks!
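A possible explanation for the second error, offered as a sketch rather than a confirmed diagnosis: the removed DataFrame.from_csv used the first column as the row index by default (index_col=0, and also parse_dates=True), whereas pd.read_csv does not. If the script later writes the table back out together with its index, a plain read_csv call would add one extra leading column, which would match "Expected 75 columns. Found 76" and the repeated "37   37" at the start of the reported line. The table contents and column names below are made up for illustration; only the pandas calls are real.

```python
import io

import pandas as pd

# Tiny stand-in for a tab-separated junctions table (hypothetical columns).
TAB = "index\trefid\tstart\tend\n0\tI\t100\t200\n1\tI\t300\t450\n"


def load_like_from_csv(handle):
    """Read a tab file the way DataFrame.from_csv(..., sep='\\t', header=0) did:
    the first column becomes the row index."""
    return pd.read_csv(handle, sep="\t", header=0, index_col=0)


def columns_written(df):
    """Count the tab-separated fields in the header line that to_csv emits."""
    return len(df.to_csv(sep="\t").splitlines()[0].split("\t"))


with_index = load_like_from_csv(io.StringIO(TAB))
without_index = pd.read_csv(io.StringIO(TAB), sep="\t", header=0)

print(columns_written(with_index))     # same field count as the input
print(columns_written(without_index))  # one more: the auto-generated index
```

If this is what happened, passing index_col=0 in the manual edit of rule_filter.py would be the closest match to the old behaviour; installing the fixed Portcullis release is still the safer route.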

lucventurini commented 4 years ago

Dear @lijing28101 , it appears you are using an older version of Portcullis - we solved this bug about half a year ago. How did you install the program?

If you installed it through BioConda, we did update Portcullis there; running conda update -c bioconda portcullis should install the latest, fixed version.

Kind regards

Luca Venturini

lijing28101 commented 4 years ago

Hi @lucventurini, I cannot install version 1.2.2 through conda; it reports package conflicts. My environment only has mikado installed, so I would expect their dependencies to be consistent.

Package openssl conflicts for:
defaults::python[version='>=3.6'] -> openssl[version='1.0.*|1.0.*,>=1.0.2l,<1.0.3a|>=1.0.2m,<1.0.3a|>=1.0.2n,<1.0.3a|>=1.0.2o,<1.0.3a|>=1.0.2p,<1.0.3a|>=1.1.1a,<1.1.2a|>=1.1.1b,<1.1.2a|>=1.1.1c,<1.1.2a|>=1.1.1d,<1.1.2a']
Package samtools conflicts for:
portcullis=1.2.2 -> samtools[version='>=1.9']
Package tk conflicts for:
defaults::python[version='>=3.6'] -> tk[version='8.6.*|>=8.6.7,<8.7.0a0|>=8.6.8,<8.7.0a0']
Package readline conflicts for:
defaults::python[version='>=3.6'] -> readline[version='7.*|>=7.0,<8.0a0']
Package libstdcxx-ng conflicts for:
portcullis=1.2.2 -> libstdcxx-ng[version='>=7.3.0']
defaults::python[version='>=3.6'] -> libstdcxx-ng[version='>=7.2.0|>=7.3.0']
Package libgcc-ng conflicts for:
portcullis=1.2.2 -> libgcc-ng[version='>=7.3.0']
defaults::python[version='>=3.6'] -> libgcc-ng[version='>=7.2.0|>=7.3.0']
Package boost-cpp conflicts for:
portcullis=1.2.2 -> boost-cpp[version='>=1.70.0,<1.70.1.0a0']
Package ncurses conflicts for:
defaults::python[version='>=3.6'] -> ncurses[version='6.0.*|>=6.0,<7.0a0|>=6.1,<7.0a0']
Package pandas conflicts for:
portcullis=1.2.2 -> pandas
Package pip conflicts for:
defaults::python[version='>=3.6'] -> pip
Package zlib conflicts for:
portcullis=1.2.2 -> zlib[version='>=1.2.11,<1.3.0a0']
defaults::python[version='>=3.6'] -> zlib[version='>=1.2.11,<1.3.0a0']
Package tabulate conflicts for:
portcullis=1.2.2 -> tabulate
Package sqlite conflicts for:
defaults::python[version='>=3.6'] -> sqlite[version='>=3.20.1,<4.0a0|>=3.22.0,<4.0a0|>=3.23.1,<4.0a0|>=3.24.0,<4.0a0|>=3.25.2,<4.0a0|>=3.25.3,<4.0a0|>=3.26.0,<4.0a0|>=3.27.2,<4.0a0|>=3.29.0,<4.0a0|>=3.30.0,<4.0a0|>=3.30.1,<4.0a0']
Package libffi conflicts for:
defaults::python[version='>=3.6'] -> libffi[version='3.2.*|>=3.2.1,<4.0a0']
Package numpy conflicts for:
portcullis=1.2.2 -> numpy
Package xz conflicts for:
defaults::python[version='>=3.6'] -> xz[version='>=5.2.3,<6.0a0|>=5.2.4,<6.0a0']

Best, Jing
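A conflict report like the one above is easier to read once it is grouped by who asks for what. The following is a minimal sketch, assuming the report keeps the "Package X conflicts for:" / "requester -> requirement" layout shown here; the function names are hypothetical and the sample lines are copied from the report above.

```python
import re
from collections import defaultdict

# A few lines copied from the conflict report above.
REPORT = """\
Package samtools conflicts for:
portcullis=1.2.2 -> samtools[version='>=1.9']
Package zlib conflicts for:
portcullis=1.2.2 -> zlib[version='>=1.2.11,<1.3.0a0']
defaults::python[version='>=3.6'] -> zlib[version='>=1.2.11,<1.3.0a0']
Package boost-cpp conflicts for:
portcullis=1.2.2 -> boost-cpp[version='>=1.70.0,<1.70.1.0a0']
"""

HEADER = re.compile(r"^Package (\S+) conflicts for:$")


def group_conflicts(report):
    """Map each conflicting package to its (requester, requirement) pairs."""
    conflicts = defaultdict(list)
    current = None
    for raw in report.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = match.group(1)
        elif current and " -> " in line:
            requester, requirement = line.split(" -> ", 1)
            conflicts[current].append((requester, requirement))
    return dict(conflicts)


def required_by(conflicts, requester_prefix):
    """List the conflicting packages that a given requester depends on."""
    return sorted(
        pkg
        for pkg, pairs in conflicts.items()
        if any(req.startswith(requester_prefix) for req, _ in pairs)
    )


conflicts = group_conflicts(REPORT)
print(required_by(conflicts, "portcullis"))
```

Run on the full report, this separates the constraints coming from portcullis=1.2.2 (samtools, boost-cpp, zlib and so on) from those coming from the python package itself.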

lucventurini commented 4 years ago

Dear @lijing28101, sorry to hear that. I cannot replicate the problem on my end, however; the following worked:

$ conda env remove -n mikado2  # Remove mikado environment if present
$ git clone https://github.com/EI-CoreBioinformatics/mikado.git
$ cd mikado
$ conda env create -f environment.yml  # Create Mikado2 environment
$ conda activate mikado2
$ conda install -c bioconda portcullis  # Showing the log
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: done

==> WARNING: A newer version of conda exists. <==
  current version: 4.7.12
  latest version: 4.8.1

Please update conda by running

    $ conda update -n base -c defaults conda

## Package Plan ##

  environment location: /software/stable/python/.conda/x86_64/envs/mikado2

  added / updated specs:
    - portcullis

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    certifi-2019.11.28         |           py37_0         153 KB
    htslib-1.10.2              |       h78d89cc_0         1.7 MB  bioconda
    portcullis-1.1.2           |   py37hdbcaa40_0        21.5 MB  bioconda
    samtools-1.10              |       h9402c20_2         343 KB  bioconda
    ------------------------------------------------------------
                                           Total:        23.6 MB

The following NEW packages will be INSTALLED:

  htslib             bioconda/linux-64::htslib-1.10.2-h78d89cc_0
  portcullis         bioconda/linux-64::portcullis-1.1.2-py37hdbcaa40_0
  samtools           bioconda/linux-64::samtools-1.10-h9402c20_2

The following packages will be UPDATED:

  openssl            conda-forge::openssl-1.1.1d-h516909a_0 --> pkgs/main::openssl-1.1.1d-h7b6447c_3

The following packages will be SUPERSEDED by a higher-priority channel:

  certifi                                       conda-forge --> pkgs/main

Proceed ([y]/n)? y

Downloading and Extracting Packages
samtools-1.10        | 343 KB    | ############################################################################################################################################# | 100% 
portcullis-1.1.2     | 21.5 MB   | ############################################################################################################################################# | 100% 
htslib-1.10.2        | 1.7 MB    | ############################################################################################################################################# | 100% 
certifi-2019.11.28   | 153 KB    | ############################################################################################################################################# | 100% 
Preparing transaction: done
Verifying transaction: done
Executing transaction: done
$ pip install dist/Mikado-2.0rc6-cp37-cp37m-linux_x86_64.whl
$ mikado --help
$ portcullis --help

I hope this helps in solving the installation problem on your end.

Kind regards

Luca Venturini

lijing28101 commented 4 years ago

Hi @lucventurini, this method installs version 1.1.2, not 1.2.2, and that version has the DataFrame error. I have also tried installing 1.2.2 from source, but it failed when I ran make -j 2.

urmi-21 commented 4 years ago

@lucventurini I deleted my previous comment as @lijing28101 already pointed out that from_csv() is now deprecated.
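For context, DataFrame.from_csv was deprecated in pandas 0.21 and removed in pandas 1.0, so whether the old rule_filter.py runs depends on the pandas version in the environment. A quick check might look like the sketch below; the helper name is hypothetical, and the snippet only reports what it finds without changing anything.

```python
import importlib


def from_csv_status():
    """Return (pandas_version, has_from_csv), or (None, None) without pandas."""
    try:
        pd = importlib.import_module("pandas")
    except ImportError:
        return None, None
    return pd.__version__, hasattr(pd.DataFrame, "from_csv")


version, has_from_csv = from_csv_status()
if version is None:
    print("pandas is not installed in this environment")
elif has_from_csv:
    print(f"pandas {version}: DataFrame.from_csv is still available")
else:
    print(f"pandas {version}: DataFrame.from_csv is gone; use pandas.read_csv")
```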

lucventurini commented 4 years ago

Dear @lijing28101 , @urmi-21 , thank you for your feedback. I am currently revising the Mikado environment definition file to allow installation of both programs in the same conda environment.

Please bear with me while I correct it.

Kind regards

Luca Venturini

lucventurini commented 4 years ago

Dear @lijing28101, @urmi-21, the following workaround works for now: add

  - bioconda::portcullis>=1.2.2

to the dependency list of the environment.yml file I indicated earlier, then create the environment. This will force conda to install the right versions of the packages.

I think the problem is that the portcullis package was built six months ago, so its dependencies were frozen to earlier versions of zlib. This is quite odd, as the recipe itself does not pin a specific version for any requirement. I will try to solve the problem on the channel today.

Thank you again for reporting the problem.

lucventurini commented 4 years ago

Update: the problem stems from both zlib and boost having been updated, while the portcullis package requires the earlier version.

francicco commented 4 years ago

Hi,

I'm having the same issue. Did you manage to fix it?

Cheers F