error when running ./paprica-mg_run.py

ajiwahyu commented 5 years ago

Hi Jeff

I tried to run the command below but got an error. I'm not sure whether my DIAMOND version needs to be updated or whether it is related to another problem. The FASTA file comes from your tutorial:

https://www.polarmicrobes.org/tutorial-annotating-metagenomes-with-paprica-mg/

./paprica-mg_run.py -i ERR318619_1.qc.fasta.gz -o test_mg -ref_dir ref_genome_database/ -pathways F

Error message

./paprica-mg_run.py:120: FutureWarning: from_csv is deprecated. Please use read_csv(...) instead. Note that some of the default arguments are different, so please refer to the documentation for from_csv when changing your function calls
  ec_df = pd.DataFrame.from_csv(ref_dir_path + 'paprica-mg.ec.csv')
executing DIAMOND blastx, this might take a while...
  File "/data/home/awahyu/miniconda3/bin/diamond", line 113
    print "Diamond version %s" % (get_diamond_version())
                             ^
SyntaxError: invalid syntax
  File "/data/home/awahyu/miniconda3/bin/diamond", line 113
    print "Diamond version %s" % (get_diamond_version())
                             ^
SyntaxError: invalid syntax
Traceback (most recent call last):
  File "./paprica-mg_run.py", line 138, in <module>
    diamond_df = pd.read_csv(cwd + name + '.paprica-mg.nr.txt', sep = '\t', header = None, index_col = 0)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 429, in _read
    parser = TextFileReader(filepath_or_buffer, kwds)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
    self._engine = CParserWrapper(self.f, self.options)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 1853, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
IOError: [Errno 2] File /data/home/awahyu/paprica/test_mg.paprica-mg.nr.txt does not exist: '/data/home/awahyu/paprica/test_mg.paprica-mg.nr.txt'

Any suggestions on how to solve this problem?

bowmanjeffs commented 5 years ago

Yes, it looks like the error is actually coming from Diamond, so you might try updating that. Are you using the provided paprica-mg database?
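For anyone landing here with the same SyntaxError: the traceback points at line 113 of miniconda3/bin/diamond, so that file is being parsed as Python source. That suggests the `diamond` found first on PATH may be a Python script rather than the compiled DIAMOND aligner. Below is a minimal sketch for checking this, assuming Python 3 is available for `shutil.which`; the helper name `describe_executable` is made up for illustration and is not part of paprica.

```python
import shutil


def describe_executable(path):
    """Return 'script' if the file starts with a shebang (#!), else 'binary'."""
    with open(path, "rb") as handle:
        return "script" if handle.read(2) == b"#!" else "binary"


if __name__ == "__main__":
    # shutil.which returns the first match on PATH, or None if there is none.
    found = shutil.which("diamond")
    if found is None:
        print("no 'diamond' executable found on PATH")
    else:
        print(found, "looks like a", describe_executable(found))
```

If this reports a script, a different package is probably shadowing the aligner, and installing or updating DIAMOND itself would be the next thing to try.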

ajiwahyu commented 5 years ago

Hi Jeff

Thanks for the reply, and yes, I am using the provided database.

I managed to get past the DIAMOND problem, but now this issue came up. I believe it is related to the paprica-mg.dmnd database. I tried to download it via this link https://www.polarmicrobes.org/tutorial-annotating-metagenomes-with-paprica-mg/ but got nothing from it. Any suggestions?

./paprica-mg_run.py:120: FutureWarning: from_csv is deprecated. Please use read_csv(...) instead. Note that some of the default arguments are different, so please refer to the documentation for from_csv when changing your function calls
  ec_df = pd.DataFrame.from_csv(ref_dir_path + 'paprica-mg.ec.csv')
executing DIAMOND blastx, this might take a while...
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/paprica-mgt.database/ref_genome_database/paprica-mg.dmnd
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/test_mg.paprica-mg.nr.daa
Traceback (most recent call last):
  File "./paprica-mg_run.py", line 138, in <module>
    diamond_df = pd.read_csv(cwd + name + '.paprica-mg.nr.txt', sep = '\t', header = None, index_col = 0)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 429, in _read
    parser = TextFileReader(filepath_or_buffer, kwds)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
    self._engine = CParserWrapper(self.f, self.options)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 1853, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
IOError: [Errno 2] File /data/home/awahyu/paprica/test_mg.paprica-mg.nr.txt does not exist: '/data/home/awahyu/paprica/test_mg.paprica-mg.nr.txt'

bowmanjeffs commented 5 years ago

I'm a little confused: you say you're using the provided database, but it looks like you haven't downloaded it yet? The script certainly won't work without it. The links in the tutorial are old. Use this database: https://www.polarmicrobes.org/extras/paprica-mgt.database.tgz, and be sure to read the documentation for the paprica-mg_run.py script!
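A quick pre-flight check can confirm the database files are in place before launching the run. This is only a sketch: the two file names are the ones that appear in the error messages in this thread, the directory value is a placeholder to replace with whatever you pass as -ref_dir, and `missing_database_files` is a hypothetical helper, not something paprica provides.

```python
import os

# File names taken from the error messages in this thread; the real database
# may contain more files than these two.
REQUIRED_FILES = ("paprica-mg.dmnd", "paprica-mg.ec.csv")


def missing_database_files(ref_dir, required=REQUIRED_FILES):
    """Return the names in `required` that are not regular files in ref_dir."""
    return [name for name in required
            if not os.path.isfile(os.path.join(ref_dir, name))]


if __name__ == "__main__":
    # Placeholder path: substitute the directory the script actually reads from.
    ref_dir = "ref_genome_database"
    missing = missing_database_files(ref_dir)
    if missing:
        print("missing from %s: %s" % (ref_dir, ", ".join(missing)))
    else:
        print("all expected database files found in", ref_dir)
```

An empty list means both files were found, so any remaining DIAMOND error is probably about the database contents rather than its location.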

ajiwahyu commented 5 years ago

Sorry, that was my mistake. I thought paprica-mg.dmnd was a file from a different source and database.

Also, I ran into the same problem as in this issue, https://github.com/bowmanjeffs/paprica/issues/39, which reports an incompatible database version. Can you suggest where to download a newer version?

/data/home/awahyu/paprica/paprica-mg_run.py:120: FutureWarning: from_csv is deprecated. Please use read_csv(...) instead. Note that some of the default arguments are different, so please refer to the documentation for from_csv when changing your function calls
  ec_df = pd.DataFrame.from_csv(ref_dir_path + 'paprica-mg.ec.csv')
executing DIAMOND blastx, this might take a while...
Error: Incompatible database version
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/test.paprica-mg.nr.daa
Traceback (most recent call last):
  File "/data/home/awahyu/paprica/paprica-mg_run.py", line 138, in <module>
    diamond_df = pd.read_csv(cwd + name + '.paprica-mg.nr.txt', sep = '\t', header = None, index_col = 0)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 702, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 429, in _read
    parser = TextFileReader(filepath_or_buffer, kwds)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 895, in __init__
    self._make_engine(self.engine)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
    self._engine = CParserWrapper(self.f, self.options)
  File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 1853, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
IOError: [Errno 2] File /data/home/awahyu/paprica/test.paprica-mg.nr.txt does not exist: '/data/home/awahyu/paprica/test.paprica-mg.nr.txt'

Thanks Aji

bowmanjeffs commented 5 years ago

Fixed an issue with paths, give it a try now. There may still be an issue with Diamond version but I think not...

ajiwahyu commented 5 years ago

The issue still persists. I used this database, https://www.polarmicrobes.org/extras/paprica-mgt.database.tgz, but the error is still there.

executing DIAMOND blastx, this might take a while...
Error: Incompatible database version
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/test_mg.paprica-mg.nr.daa
Traceback (most recent call last):
  File "./paprica-mg_run.py", line 138, in <module>
    diamond_df = pd.read_csv(cwd + name + '.paprica-mg.nr.txt', sep = '\t', header = None, index_col = 0)

My DIAMOND version is 0.7.12.
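When reporting or comparing DIAMOND versions it helps to treat them as tuples of integers rather than strings. The sketch below assumes Python 3.5 or later and assumes the installed build accepts a `version` subcommand; if yours does not, pass the version text to `parse_version` by hand. Both function names are illustrative only, and the thread does not state which version the database requires.

```python
import re
import subprocess


def parse_version(text):
    """Extract the first dotted version number in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def installed_diamond_version(executable="diamond"):
    """Ask the executable for its version and parse the reply."""
    # Assumption: this build understands the "version" subcommand.
    result = subprocess.run([executable, "version"],
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return parse_version(result.stdout)


if __name__ == "__main__":
    try:
        print("installed:", installed_diamond_version())
    except (OSError, ValueError) as err:
        print("could not determine DIAMOND version:", err)
```

Comparing tuples avoids the string-comparison trap where "0.10.1" would sort before "0.7.12".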

bowmanjeffs commented 5 years ago

The paprica-mgt.database directory should be untarred in the ref_genome_database directory. Give that a try...

ajiwahyu commented 5 years ago

I did that earlier, but then the script looks for the paprica-mg.ec.csv file at ~/paprica/paprica-mgt.database/ref_genome_database/paprica-mg.ec.csv, which is clearly not there since it is supposed to be at paprica/ref_genome_database/paprica-mgt.database/paprica-mg.ec.csv. I then moved the file to the path that the script wants, but then this error came up:

Error: Incompatible database version
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/test.paprica-mg.nr.daa

  File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
IOError: [Errno 2] File /data/home/awahyu/paprica/test.paprica-mg.nr.txt does not exist: '/data/home/awahyu/paprica/test.paprica-mg.nr.txt'.

As a newbie Python user, I am clearly baffled by what is going on here.

Thanks for the continued replies, Aji

bowmanjeffs commented 5 years ago

Yes, this is what I fixed. I had a bad path in the script. Try it with the latest version and see if you still have an issue. I am running a more recent version of Diamond but I don't think that's the issue.
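On the confusing traceback: in every log above, the final IOError from pandas is a downstream symptom. DIAMOND failed first, so the .paprica-mg.nr.txt file was never written, and read_csv then had nothing to open. The sketch below shows one generic way to surface the earlier failure. It is not the code in paprica-mg_run.py, and `run_and_require` is a hypothetical helper.

```python
import os
import subprocess


def run_and_require(command, expected_output):
    """Run command; raise a clear error if it fails or leaves no output file."""
    returncode = subprocess.call(command)
    if returncode != 0:
        raise RuntimeError("%s exited with status %d"
                           % (command[0], returncode))
    if not os.path.isfile(expected_output):
        raise RuntimeError("%s finished but did not write %s"
                           % (command[0], expected_output))
    return expected_output
```

With a check like this, the first thing reported is the failing external command rather than a missing file several steps later.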

(Sent by email in reply to ajiwahyu's comment of Tue, Jul 23, 2019, 2:11 PM.)

bowmanjeffs / paprica

error when running ./paprica-mg_run.py #68