clb21565 / mobileOG-db

code repo for mobileOG-db
GNU General Public License v3.0

something wrong with the "mobileOG-pl/mobileOGs-pl-kyanite.py" #14

Closed: Rooobben closed this issue 11 months ago

Rooobben commented 1 year ago

When I tried to process the DIAMOND output with the Python script, the following message was reported and I can't resolve it. I hope you can help, thanks.

python mobileOG-pl/mobileOGs-pl-kyanite.py --o /MobileOG/ --i 1_mobileOG.tsv -m mobileOG-db-beatrix-1.6-All.csv

sys:1: DtypeWarning: Columns (7) have mixed types.Specify dtype option on import or set low_memory=False.

clb21565 commented 1 year ago

Hey there, thanks for using the pipeline! As far as I can tell, this message will not affect anything substantive, and you should still be getting the correct output. We will fix it in a future update, but let me know if you're not getting the correct output.
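The warning text itself names two remedies: pass an explicit dtype when reading the file, or set low_memory=False. A minimal sketch of both with pandas read_csv, assuming the table is loaded that way. The in-memory sample and its column names are invented for illustration and are not the real mobileOG-db metadata columns.

```python
import io

import pandas as pd

# Invented stand-in table: the "score" column mixes numbers and text,
# which is the kind of column pandas complains about in large files.
CSV_TEXT = "id,name,score\n1,a,10\n2,b,unknown\n3,c,7\n"


def read_table(handle, text_columns):
    """Read a CSV, forcing the listed columns to str so pandas does not guess."""
    dtype = {col: str for col in text_columns}
    # low_memory=False makes pandas infer types from the whole file at once
    # instead of chunk by chunk; dtype pins the named columns explicitly.
    return pd.read_csv(handle, dtype=dtype, low_memory=False)


df = read_table(io.StringIO(CSV_TEXT), ["score"])
print(df["score"].tolist())
```

Pinning the mixed column to str keeps every value as text, so any numeric conversion afterwards has to be done deliberately.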

Rooobben commented 1 year ago

Thanks for your reply. I was only able to run DIAMOND against the mobileOG database and get the resulting tsv file. When I used mobileOGs-pl-kyanite.sh, it took more than 48h and produced no output, so I gave up on that command.

clb21565 commented 1 year ago

Thanks for letting me know! How large was the dataset you used?

Also, regarding "it took more than 48h": was the DIAMOND output not produced?

PS, depending on what you have, I'd be happy to provide an accessory script for parsing the output free of charge. Thanks for using mobileOG-db!

Rooobben commented 1 year ago

Thanks for your reply. The commands I used:

diamond makedb --in mobileOG-db_beatrix-1.6.All.faa -d mobileOG.dmnd

diamond blastp -q blast_1.fa --db mobileOG --outfmt 6 stitle qtitle pident bitscore slen evalue qlen sstart send qstart qend -o mobileOG.tsv -e 1e-10 --query-cover 70 --id 70

Here blast_1.fa is 282.59 KB, and mobileOG.tsv was written successfully.
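The --outfmt 6 field list in the blastp command fixes the column order of mobileOG.tsv. A small sketch that reads such a file with those names attached, using only the Python standard library. The sample row is invented, and this is not how mobileOGs-pl-kyanite.py itself parses the file.

```python
import csv
import io

# Column order copied from the --outfmt 6 field list in the diamond blastp command.
FIELDS = ["stitle", "qtitle", "pident", "bitscore", "slen", "evalue",
          "qlen", "sstart", "send", "qstart", "qend"]


def read_hits(handle):
    """Return one dict per hit, keyed by the field names above."""
    reader = csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    return list(reader)


# Invented example row (hypothetical values, tab-separated).
SAMPLE = "subject_A desc\tquery_1\t91.2\t310\t250\t1e-50\t240\t1\t230\t5\t235\n"

hits = read_hits(io.StringIO(SAMPLE))
print(hits[0]["pident"], hits[0]["evalue"])
```

All values come back as strings, so a malformed field shows up as-is rather than being silently coerced.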

Then:

python mobileOGs-pl-kyanite.py --o /mobileOG --i mobileOG.tsv -m mobileOG-db-beatrix-1.6-All.csv

I got:

sys:1: DtypeWarning: Columns (7) have mixed types.Specify dtype option on import or set low_memory=False.

So I tried the shell script instead:

chmod +x mobileOGs-pl-kyanite.sh

./mobileOGs-pl-kyanite.sh -i blast_1.fa -d mobileOG.dmnd -m mobileOG-db-beatrix-1.X.All.csv -k 15 -e 1e-10 -p 70 -q 70

It ran for more than 48h and nothing was exported.

clb21565 commented 1 year ago

Thanks for sharing this info! In command 2:

python mobileOGs-pl-kyanite.py --o /mobileOG --i mobileOG.tsv -m mobileOG-db-beatrix-1.6-All.csv

it looks like the issue is:

--o /mobileOG

In the code above, you're writing your output to the / (root) directory. In the first snippet (the one that worked), you were writing the output to mobileOG.tsv (i.e., ./mobileOG.tsv), which is relative to whatever pwd is.

If you cd into /, I reckon the output will be there. To fix it, you can just remove the leading forward slash and it should write to pwd like the first one did.
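The difference between the two forms can be sketched with pathlib. This only illustrates how absolute and relative paths resolve on a POSIX system; how the script actually uses the --o value is not shown in this thread, and the working directory below is hypothetical.

```python
from pathlib import PurePosixPath


def resolve_output(prefix, cwd):
    """Return where `prefix` points when a program is started from `cwd`."""
    path = PurePosixPath(prefix)
    # An absolute prefix ignores cwd; a relative one is joined onto it.
    return path if path.is_absolute() else PurePosixPath(cwd) / path


CWD = "/home/user/project"  # hypothetical working directory

print(resolve_output("/mobileOG", CWD))  # /mobileOG
print(resolve_output("mobileOG", CWD))   # /home/user/project/mobileOG
```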

In the last command, you had -m mobileOG-db-beatrix-1.X.All.csv when you should have had -m mobileOG-db-beatrix-1.6-All.csv like the other commands. Sometimes when Linux can't find a file it will just cycle endlessly with no warning (I am not sure why); this is my guess as to what happened.
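One way to rule out a mistyped file name before starting a long run is to check that every input exists first. A minimal sketch with the standard library; the helper name missing_files is made up for illustration and is not part of the pipeline.

```python
from pathlib import Path


def missing_files(paths):
    """Return the subset of `paths` that do not exist as regular files."""
    return [str(p) for p in paths if not Path(p).is_file()]


# File names as typed in the thread; whether they exist depends on
# the directory this is run from.
candidates = ["mobileOG-db-beatrix-1.X.All.csv", "mobileOG-db-beatrix-1.6-All.csv"]
for name in missing_files(candidates):
    print(f"not found: {name}")
```

Running a check like this before the shell script would surface a wrong -m value immediately instead of after hours of waiting.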

If this was an issue of documentation, please let me know how it can be improved! Thanks again, and please let me know if this does not fix the issue.

Rooobben commented 1 year ago

Sorry, it seems you misunderstood what I meant.

For --o /mobileOG, I just omitted the cumbersome path here; the full path was intact when I actually entered the command: \home\users\data\mobileOG\mobileOG

And the message sys:1: DtypeWarning: Columns (7) have mixed types.Specify dtype option on import or set low_memory=False seems to refer to my input file mobileOG.tsv, which has a formatting problem in column 7 that I can't fix.
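To see whether a given column really mixes numbers and text, the values can be tallied directly. A rough sketch using only the standard library. The thread does not establish which file pandas was complaining about or how the column number is counted, so the file and the 0-based column index are left as parameters, and the sample rows are invented.

```python
import csv
import io
from collections import Counter


def column_kinds(handle, column, delimiter="\t"):
    """Count values in one 0-based column as numeric, text, empty, or missing."""
    kinds = Counter()
    for row in csv.reader(handle, delimiter=delimiter):
        if column >= len(row):
            kinds["missing"] += 1  # row is too short to have this column
            continue
        value = row[column].strip()
        if value == "":
            kinds["empty"] += 1
            continue
        try:
            float(value)
            kinds["numeric"] += 1
        except ValueError:
            kinds["text"] += 1
    return kinds


# Invented rows: the second column holds a number, a word, and an empty field.
SAMPLE = "a\t1\nb\tx\nc\t\n"
print(column_kinds(io.StringIO(SAMPLE), 1))
```

A column that reports both numeric and text counts is a candidate for the mixed-type message; for a real file, pass an open file handle in place of the in-memory sample.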

Thanks a lot.

clb21565 commented 1 year ago

sys:1: DtypeWarning: Columns (7) have mixed types.Specify dtype option on import or set low_memory=False

This error is pretty common and should not impact the output. It is odd that there was no output after 48h. Unfortunately, without access to your data, there is not much I can do.

Just wanted to confirm you did run:

-m mobileOG-db-beatrix-1.6-All.csv

in the last command, yes?

Rooobben commented 1 year ago

Yes, I did run -m mobileOG-db-beatrix-1.6-All.csv. And this error does affect the output: when it was raised, the command terminated automatically and no output was produced.

balaram26 commented 11 months ago

Hi, not sure if you are still having this issue. I had the same problem and was able to debug it: the issue comes from the Python script. I have made the changes and created a pull request to fix this bug.

clb21565 commented 11 months ago

Thanks! Updated the repo.