ajiwahyu closed this issue 4 years ago
Yes, it looks like the error is actually coming from Diamond, so you might try updating that. Are you using the provided paprica-mg database?
Hi Jeff
Thanks for the reply, and yes, I am using the provided database.
I managed to work around the Diamond problems, but then this issue came up. I believe it is related to the paprica-mg.dmnd database. I tried to download it via this link, https://www.polarmicrobes.org/tutorial-annotating-metagenomes-with-paprica-mg/, but got nothing from it. Any suggestions?
./paprica-mg_run.py:120: FutureWarning: from_csv is deprecated. Please use read_csv(...) instead. Note that some of the default arguments are different, so please refer to the documentation for from_csv when changing your function calls
ec_df = pd.DataFrame.from_csv(ref_dir_path + 'paprica-mg.ec.csv')
executing DIAMOND blastx, this might take a while...
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/paprica-mgt.database/ref_genome_database/paprica-mg.dmnd
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/test_mg.paprica-mg.nr.daa
Traceback (most recent call last):
File "./paprica-mg_run.py", line 138, in
I'm a little confused, you say you're using the provided database but it looks like you haven't downloaded it yet? Certainly the script won't work if you haven't downloaded it. The links in the tutorial are old. Use this database: https://www.polarmicrobes.org/extras/paprica-mgt.database.tgz, and be sure to read the documentation for the paprica-mg_run.py script!
Sorry, that was my mistake. I thought paprica-mg.dmnd was a file from a different source and database.
Also, I ran into the same problem as in this issue, https://github.com/bowmanjeffs/paprica/issues/39, which reports an incompatible database version. Can you suggest where I can download a newer version?
/data/home/awahyu/paprica/paprica-mg_run.py:120: FutureWarning: from_csv is deprecated. Please use read_csv(...) instead. Note that some of the default arguments are different, so please refer to the documentation for from_csv when changing your function calls
ec_df = pd.DataFrame.from_csv(ref_dir_path + 'paprica-mg.ec.csv')
executing DIAMOND blastx, this might take a while...
Error: Incompatible database version
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/test.paprica-mg.nr.daa
Traceback (most recent call last):
File "/data/home/awahyu/paprica/paprica-mg_run.py", line 138, in
Thanks Aji
Fixed an issue with paths; give it a try now. There may still be an issue with the Diamond version, but I don't think so...
The issue still persists. I used this database, https://www.polarmicrobes.org/extras/paprica-mgt.database.tgz, but the error is still there:
executing DIAMOND blastx, this might take a while...
Error: Incompatible database version
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/test_mg.paprica-mg.nr.daa
Traceback (most recent call last):
File "./paprica-mg_run.py", line 138, in
My Diamond version is 0.7.12.
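Because "Incompatible database version" is printed by DIAMOND itself rather than by paprica, one thing worth ruling out is a mismatch between the installed DIAMOND release and the release used to build the .dmnd file. Below is a minimal Python 3 sketch for comparing version strings. The helper names are hypothetical, the minimum version in the usage note is a placeholder rather than a documented paprica requirement, and the call `diamond version` is an assumption about the installed binary (older releases may expect a different flag).

```python
import re
import subprocess


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in: %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def installed_diamond_version(executable="diamond"):
    """Ask the executable for its version.

    Assumption: `diamond version` prints a line containing the release number.
    """
    result = subprocess.run([executable, "version"],
                            capture_output=True, text=True)
    return parse_version(result.stdout + result.stderr)


def is_older(installed, minimum):
    """Return True if the installed version string sorts before the minimum."""
    return parse_version(installed) < parse_version(minimum)
```

For example, `is_older("0.7.12", "0.9.24")` compares the version reported here against a placeholder minimum; substitute whatever release the database was actually built with.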
The paprica-mgt.database directory should be untarred in the ref_genome_database directory. Give that a try...
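To confirm the layout described above before launching the script, a small check along these lines may help. It is only a sketch: `missing_database_files` is a hypothetical helper, and the two file names are simply the ones mentioned in this thread, so the real archive may contain more.

```python
import os

# File names taken from the paths quoted in this thread; the archive may hold others.
EXPECTED_FILES = ("paprica-mg.dmnd", "paprica-mg.ec.csv")


def missing_database_files(ref_dir, db_dir="paprica-mgt.database"):
    """Return the expected database files that are absent under ref_dir/db_dir."""
    base = os.path.join(ref_dir, db_dir)
    return [name for name in EXPECTED_FILES
            if not os.path.isfile(os.path.join(base, name))]
```

Running `print(missing_database_files("ref_genome_database"))` from the paprica directory should print an empty list if the archive was unpacked where the script expects it.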
I did that earlier, but then the script looked for the paprica-mg.ec.csv file at ~/paprica/paprica-mgt.database/ref_genome_database/paprica-mg.ec.csv, which is clearly not there, since it is supposed to be at paprica/ref_genome_database/paprica-mgt.database/paprica-mg.ec.csv. I then moved the file to the path that the script wants, but then this error came up:
Error: Incompatible database version
Error: function Input_stream::Input_stream(const string&, bool) line 75. Error opening file /data/home/awahyu/paprica/test.paprica-mg.nr.daa
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
IOError: [Errno 2] File /data/home/awahyu/paprica/test.paprica-mg.nr.txt does not exist: '/data/home/awahyu/paprica/test.paprica-mg.nr.txt'.
As a newbie Python user, I am clearly baffled by what is going on here.
Thanks for continuing to reply, Aji
Yes, this is what I fixed. I had a bad path in the script. Try it with the latest version and see if you still have an issue. I am running a more recent version of Diamond but I don't think that's the issue.
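The final IOError in the log above is most likely a downstream symptom: DIAMOND failed, so the .paprica-mg.nr.txt table was never written, and pandas then could not open it. The sketch below shows one way to stop at the first failing step so the real error is not buried. `run_step` is a hypothetical helper written for Python 3, not part of paprica.

```python
import os
import subprocess


def run_step(command, expected_output):
    """Run command and raise if it fails or does not create expected_output."""
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError("%s exited with status %d:\n%s"
                           % (command[0], result.returncode, result.stderr))
    if not os.path.isfile(expected_output):
        raise RuntimeError("%s finished but %s was not created"
                           % (command[0], expected_output))
    return expected_output
```

Checking both the exit status and the output file covers tools that print an error but still exit with status 0.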
Hi Jeff
I tried to run this command, but an error came up. I'm not sure whether my Diamond version needs to be updated or whether it is related to another problem. The FASTA file comes from your link:
https://www.polarmicrobes.org/tutorial-annotating-metagenomes-with-paprica-mg/
./paprica-mg_run.py -i ERR318619_1.qc.fasta.gz -o test_mg -ref_dir ref_genome_database/ -pathways F
Error message:
./paprica-mg_run.py:120: FutureWarning: from_csv is deprecated. Please use read_csv(...) instead. Note that some of the default arguments are different, so please refer to the documentation for from_csv when changing your function calls
ec_df = pd.DataFrame.from_csv(ref_dir_path + 'paprica-mg.ec.csv')
executing DIAMOND blastx, this might take a while...
File "/data/home/awahyu/miniconda3/bin/diamond", line 113
print "Diamond version %s" % (get_diamond_version())
^
SyntaxError: invalid syntax
File "/data/home/awahyu/miniconda3/bin/diamond", line 113
print "Diamond version %s" % (get_diamond_version())
^
SyntaxError: invalid syntax
Traceback (most recent call last):
File "./paprica-mg_run.py", line 138, in
diamond_df = pd.read_csv(cwd + name + '.paprica-mg.nr.txt', sep = '\t', header = None, index_col = 0)
File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 702, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 429, in _read
parser = TextFileReader(filepath_or_buffer, kwds)
File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 895, in __init__
self._make_engine(self.engine)
File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 1122, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/usr/lib64/python2.7/site-packages/pandas/io/parsers.py", line 1853, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 387, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 705, in pandas._libs.parsers.TextReader._setup_parser_source
IOError: [Errno 2] File /data/home/awahyu/paprica/test_mg.paprica-mg.nr.txt does not exist: '/data/home/awahyu/paprica/test_mg.paprica-mg.nr.txt'
Any suggestions on how to solve this problem?
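The SyntaxError in the traceback above is raised from /data/home/awahyu/miniconda3/bin/diamond itself, which suggests that the `diamond` found on the PATH is a Python script rather than the compiled DIAMOND aligner. A quick way to check is sketched below, assuming the aligner is installed as a native binary while an unrelated `diamond` would be a text script starting with `#!`. Both helper names are hypothetical, and the code targets Python 3.

```python
import shutil


def looks_like_script(path):
    """Return True if the file at path starts with a shebang (#!)."""
    with open(path, "rb") as handle:
        return handle.read(2) == b"#!"


def describe_executable(name="diamond"):
    """Report where name resolves on the PATH and whether it is a script."""
    path = shutil.which(name)
    if path is None:
        return "%s not found on PATH" % name
    kind = "script" if looks_like_script(path) else "binary"
    return "%s -> %s (%s)" % (name, path, kind)
```

If `print(describe_executable())` reports a script, a different package named diamond is probably shadowing the aligner, and updating or reinstalling DIAMOND itself would be the next thing to try.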