zavolanlab / MIRZAG

MIRZA-G - Pipeline and model for miRNA target prediction

rg_count_miRNA_seeds_and_filter_duplicates.py: drop_duplicates() got an unexpected keyword argument 'cols' #4

Closed: sujaikumar closed this issue 5 years ago

sujaikumar commented 6 years ago

Managed to get all the dependencies and jobber installed and running finally :-)

To test the installation, I followed the instructions at http://mirzag.readthedocs.io/en/latest/usage.html#example - I ran this:

cd /path/to/MIRZAG/tests
bash rg_run_test.sh run

and got this output:

Running tests
MIRZA is not set: setting default (MIRZA).
CONTRAfold is not set: setting default (contrafold).
In order to stop the pipeline run a command:
jobber_server -command delete -jobId 6

An output folder was created with these files:

$ ls -1 output
hsa-miR-1972.fa
hsa-miR-1973.fa
hsa-miR-1976.fa
hsa-miR-495-3p.fa
hsa-miR-6879-5p.fa

But no further files were created, so I looked at the log files in ~/.jobber/log/0/ and found the same error in each of the 5 jobs that were running the rg_count_miRNA_seeds_and_filter_duplicates.py script:

This is the command:

cat ~/.jobber/log/0/0/12/1/command.sh
#!/bin/bash

date +%s > /ceph/users/skumar/.jobber/log/0/0/12/1/start
python /ceph/users/skumar/MIRZAG.2018-04-13/scripts/rg_count_miRNA_seeds_and_filter_duplicates.py \
                                        --motifs /ceph/users/skumar/MIRZAG.2018-04-13/tests/output/hsa-miR-1976.fa \
                                        --seqs /ceph/users/skumar/MIRZAG.2018-04-13/tests/utrs.fa \
                                        --output /ceph/users/skumar/MIRZAG.2018-04-13/tests/output/hsa-miR-1976.seedcount \
                                        --how TargetScan \
                                        --context 50 \
                                        --split-by "|" \
                                        --index-after-split 1 \
                                        -v

exitStatus=$?
date +%s > /ceph/users/skumar/.jobber/log/0/0/12/1/end
exit ${exitStatus}

And this is the error:

cat ~/.jobber/log/0/0/12/1/err
############## Started script on 17-04-2018 at 13:52:56 ##############
Traceback (most recent call last):
  File "/ceph/users/skumar/MIRZAG.2018-04-13/scripts/rg_count_miRNA_seeds_and_filter_duplicates.py", line 183, in <module>
    main(options)
  File "/ceph/users/skumar/MIRZAG.2018-04-13/scripts/rg_count_miRNA_seeds_and_filter_duplicates.py", line 153, in main
    ndf = data.drop_duplicates(cols=['newid', 'mirna', 'seq'])
TypeError: drop_duplicates() got an unexpected keyword argument 'cols'

I could be wrong, but this sounds like a problem with the script, not with the dependencies. I don't know enough Python to debug this, but if you have any suggestions, I'm happy to try them.

guma44 commented 6 years ago

Hi, what version of pandas did you install? This should still work with pandas 0.14, but the cols argument is not available in current versions.

sujaikumar commented 6 years ago

I have pandas 0.22.0

guma44 commented 6 years ago

I suggest downgrading pandas to 0.14, or to the latest version that still supports the cols parameter.
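
If downgrading is not an option, it may also be possible to patch the call instead. Newer pandas releases name this parameter subset rather than cols, so line 153 of the script could probably be changed to use subset. The following is only a minimal sketch and has not been tested against the MIRZAG pipeline; drop_duplicates_compat is a hypothetical helper name, not something that exists in MIRZAG or pandas.

```python
def drop_duplicates_compat(frame, columns):
    """Drop rows that are duplicated on `columns`.

    Hypothetical helper (not part of MIRZAG or pandas): it tries the
    keyword used by newer pandas releases first and falls back to the
    old one, so the same call works on either side of the rename.
    """
    try:
        # Newer pandas releases call the parameter `subset`.
        return frame.drop_duplicates(subset=columns)
    except TypeError:
        # Old releases that predate `subset` only accepted `cols`.
        # Caveat: any other TypeError raised by drop_duplicates also
        # ends up here, so this is a sketch, not a robust shim.
        return frame.drop_duplicates(cols=columns)


# Intended replacement for line 153 of
# rg_count_miRNA_seeds_and_filter_duplicates.py (assumption, untested):
#     ndf = drop_duplicates_compat(data, ['newid', 'mirna', 'seq'])
```

If only one pandas version needs to be supported, the simpler change is to edit the line directly to data.drop_duplicates(subset=['newid', 'mirna', 'seq']). Other parts of the pipeline may rely on old pandas behaviour as well, so downgrading remains the safer route.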

sujaikumar commented 6 years ago

Thanks. Will try that!