Closed linsalrob closed 2 years ago
Great that you did this, as I am having trouble getting the DBs to work. I downloaded https://edwards.sdsu.edu/SUPERFOCUS/downloads/conda/diamond_v1/98_clusters.db.dmnd.zip; however, I get the following message:
Error: Database was built with a different version of Diamond and is incompatible.
diamond v0.9.24.125 | by Benjamin Buchfink <buchfink@gmail.com>
Did I misunderstand your instructions? Which file should work with diamond v0.9.24.125? FYI, I also tried https://edwards.sdsu.edu/SUPERFOCUS/downloads/conda/diamond_v3/98_clusters.db.dmnd.zip, with the same error. I would be very glad if you could help me get the DB running. So frustrating. I even tried installing SF with pip, with conda, and by cloning the git repository, but I always get an error when it tries to find the DB during the run. BTW, whichever way I installed SF, it reported version 0.0.0. Not sure whether this is an issue...
Hi @StefPN, it looks to me like the DIAMOND version you used is different from the one @linsalrob used to build the database. You can always download the raw FASTA and format it yourself, as described in the tool's README file.
Let me know how it goes.
Thanks Geni! Unfortunately, that did not work. No diamond folder was created in the static directory for any of the SUPER-FOCUS installations I tried. All of them (pip, your git repository, and conda) report version 0.0.0. They seem to work, but they do not find the DB, because formatting it following your instructions does not seem to create the necessary folders and directories for me. That is why I tried this approach. So what can I do now?
@StefPN Can you please try this?
I am not sure what you mean. The link gets me to the same problem I have but I do not see a solution there?
@StefPN I'm at work right now. I will get back to you with more details this weekend, ok?
I would be very glad! Thank you!
According to the diamond help page, v0.9.19 to v0.9.24 produce and accept format version 2, which of course is the only version I did not include. I will attempt to install and include that version here.
@StefPN I have updated the comment here with revised locations for all the databases. Can you use the database that matches your diamond version and let us know if it solves your problem?
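The version range mentioned above can be checked mechanically. This is only a minimal sketch: the function name `uses_db_format_2` is made up for illustration, and the only rule it encodes is the one quoted in this thread (v0.9.19 to v0.9.24 use database format version 2). It says nothing about which format other DIAMOND versions use.

```shell
# Hypothetical helper: succeed if a DIAMOND version string falls in
# 0.9.19 through 0.9.24, the range quoted in this thread as using
# database format version 2. Returns 2 if the string cannot be parsed.
uses_db_format_2() {
    version=${1#v}              # strip a leading "v" if present
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    rest=${rest#*.}
    patch=${rest%%.*}           # any fourth component (e.g. .125) is ignored
    for part in "$major" "$minor" "$patch"; do
        case "$part" in
            ''|*[!0-9]*) return 2 ;;
        esac
    done
    [ "$major" -eq 0 ] && [ "$minor" -eq 9 ] && [ "$patch" -ge 19 ] && [ "$patch" -le 24 ]
}

# Example with the version reported earlier in this thread.
if uses_db_format_2 "v0.9.24.125"; then
    echo "format 2"
else
    echo "not in the 0.9.19 to 0.9.24 range"
fi
```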
Dear Rob, Thank you very much! I now downloaded the 98 v2. When I start running SF, I do not get the DB error I got before. However, unfortunately I get the following error:
superfocus -q SUPERFOCUS/ -dir SUPERFOCUS/output/ -db DB_98 -a blast
[2020-08-24 08:25:01,838 - INFO] SUPER-FOCUS: A tool for agile functional analysis of shotgun metagenomic data
[2020-08-24 08:25:01,841 - INFO] 1.1) Working on: NC.01_R1.fastq
[2020-08-24 08:25:01,841 - INFO] Aligning sequences in NC.01_R1.fastq to 98 using blast
BLAST query error: CFastaReader: Near line 1, there's a line that doesn't look like plausible data, but it's not marked as defline or comment.
[2020-08-24 08:25:01,938 - INFO] Parsing Alignments
Traceback (most recent call last):
File "/home/stefanie.prast/.local/bin/superfocus", line 11, in `<module>`
load_entry_point('superfocus==0.0.0', 'console_scripts', 'superfocus')()
File "/home/stefanie.prast/.local/lib/python3.7/site-packages/superfocus-0.0.0-py3.7.egg/superfocus_app/superfocus.py", line 342, in main
del_alignments)
ValueError: not enough values to unpack (expected 2, got 0)
(base) [stefanie.prast@ctmr-nas ep]$ head SUPERFOCUS/NC.01_R1.fastq
@M01548:130:000000000-BBN6D:1:2106:18580:4509 1:N:0:TCCGGAGA+CCTATCCT
TAAAACCGGGAAATGGACCGATGCCCGTTCTTATCTTACAAACATGGGCATTGATAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGGAGAAGCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAATGAAAAAAAAAAAGATGAGAGGCAAAAAACACAAAACATTAAAATAGAAGTGAGACATGTATAGAGAGAAGAGAGAAGAAAAGTATGAGCGGAGTAGAGACGTCAGGTGACGTAAGCTGTAGTACGATAGTTAAATTGAGTTCTAACAAGTAGAGAGAGTACTGTGA
+
B@BCCG7@@CGGFDGGCGGCG@BGE<F:<F@FF96F<@FGFDFGCGGGFGGCFFAA@C<6,CB::CDACFF@6FC7CDFCFAGEFFECCFGEEB:7=CF,4:=:CC8F<AF+8E+5,C5,CFCF<EEFCF@:++,77BF,,@ECFC***,,,,,,***,,6***,,,*,4,*,,++,,52+++++++2+5*++35959+3+/*+*+***/+*+0**++++3++**)*)**+*+***/)*)**+1+)10*)+*0*0*2)*0*(10*)*./)*-*)/*).*-))*()1))))(,(.:).-4))
@M01548:130:000000000-BBN6D:1:2106:16678:4867 1:N:0:TCCGGAGA+CCTATCCT
GATATTTTTCTCTTCAGTGATGCTGCACTGGAAAGTAACGCCGGGGAATGCTTACAACCAGCCAAGGGGATCTCGAGAATGATTCTGCCTAGGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCGGAGAATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAACACACGCTACCCAACCACCCTTCTCTACGCTCTCTTTTACTTATACTGCCCTAGCCTCACACCCCCCATCTTACCTCCATTCACCTTCCTTCACCGCCCTCTCCACCACCTACATTCTACACCTCCTCA
+
CCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGEGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGGGGGGGGGGGGGGGGGGGFGGGGGGGGGGGG,?FFGFGGGGGGFGDGGGGGG9FFGGGGGGGGG**5*>,,**4*6:,:<***6*1***5>+@+5*3*/*++5;++++++2+=+1+2*0**+*+2+*+*/)/)*)28C*+*08C5*977C*:)*)*)***)2(),),)(0)-()(((0(.).))))-)((((243.
@M01548:130:000000000-BBN6D:1:2106:28259:8664 1:N:0:TCCGGAGA+CCTATCCT
CTGGGCACCACCGGTGCAGTGAACCATACCGTGAACCTCTGGGCGGAGCTCATCGAGAATCTTCTTGATGACAGGAGCGTAGGTACGGGTAGGAGAGAGTACCAGTTCTCCGGCATTGATAGGTGAACCCTCTACCTCATCAGTCAACTTATACTTACCGCTGTAAACCAACTCCTCTGGCACGGCGTGGTCGTAGCTCTCAGGATAGTTCTCTGCGAGATACTTGGCGAATACATCGTGGCGGGCAGAAGTCAAACCGTTGCTGCCCATACCGCCGTTGTACTTCTTCTCGTAAGTAGCC
I do not see anything wrong with my fastq file. What could be the problem? Thank you very much for your help!
This is the diamond database, so change -a blast to -a diamond
Rob
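As a rough illustration of the fix above, the aligner can be chosen from the database file name before the command is built. `aligner_for_db` is a hypothetical helper, and only the `.dmnd` case comes from this thread. The flags and paths are the ones from the command posted above. The sketch prints the command instead of running it.

```shell
# Hypothetical helper: guess the aligner from a database file name.
# Only the .dmnd case is taken from this thread; anything else is
# deliberately left undecided.
aligner_for_db() {
    case "$1" in
        *.dmnd) echo "diamond" ;;
        *)      echo "unknown" ;;
    esac
}

aligner=$(aligner_for_db "98_clusters.db.dmnd")

# Print the command rather than running it, so it can be reviewed first.
echo "superfocus -q SUPERFOCUS/ -dir SUPERFOCUS/output/ -db DB_98 -a $aligner"
```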
Of course! Sorry, I was not paying enough attention when copying the code this morning. It seems to be running now. Thank you!
Are you running two superfocus commands at the same time? I think you will need to run one and wait for it to finish and then run the other. Or run them in separate directories.
Rob
On Mon, 24 Aug 2020 at 22:10, StefPN wrote:
So, while the db seems to be correct now, I get another error:
Deallocating taxonomy... [8e-06s]
Total time = 724.78s
Reported 29736263 pairwise alignments, 29750338 HSPs.
1632832 queries aligned.
diamond v0.9.24.125 | by Benjamin Buchfink <buchfink@gmail.com>
Licensed under the GNU GPL <https://www.gnu.org/licenses/gpl.txt>
Check http://github.com/bbuchfink/diamond for updates.
CPU threads: 40
Loading subject IDs... No such file or directory [0.000217s]
Error: Error opening file SUPERFOCUS/output/PA.01.fastq_alignments.daa
rm: cannot remove ‘SUPERFOCUS/output/*.daa’: No such file or directory
The only files created up to this step are:
-rw-rw-r--. 1 stefanie.prast users 0 Aug 24 08:20 NC.01.fastq_alignments
-rw-rw-r--. 1 stefanie.prast users 0 Aug 24 08:25 NC.01_R1.fastq_alignments
-rw-rw-r--. 1 stefanie.prast users 1146346 Aug 24 13:55 NC.01_R1.fastq_alignments.m8
-rw-rw-r--. 1 stefanie.prast users 0 Aug 24 08:21 PA.01.fastq_alignments
Note that NC.01 was processed first, without any error reported. This is the negative control, so a very very small file. What could be the problem now?
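The suggestion to run one command at a time, or to use separate directories, could look roughly like the sketch below. It combines both ideas: samples are processed one after another, each with its own output directory. The function name and the `SUPERFOCUS_BIN` variable are invented for illustration, and it assumes `-q` accepts a single FASTQ file as well as a directory. By default it only prints the commands.

```shell
# Run one sample at a time, each with its own output directory.
# SUPERFOCUS_BIN defaults to "echo superfocus", so this sketch only prints
# the commands; set SUPERFOCUS_BIN=superfocus to actually run them.
# Assumption: -q accepts a single FASTQ file as well as a directory.
run_samples_sequentially() {
    out_root=$1
    shift
    for fastq in "$@"; do
        sample=$(basename "$fastq" .fastq)
        mkdir -p "$out_root/$sample"
        # The loop waits for each command to finish before starting the next.
        ${SUPERFOCUS_BIN:-echo superfocus} -q "$fastq" -dir "$out_root/$sample" -db DB_98 -a diamond || return 1
    done
}
```

For example, `run_samples_sequentially SUPERFOCUS/output SUPERFOCUS/NC.01.fastq SUPERFOCUS/PA.01.fastq` would write to `SUPERFOCUS/output/NC.01` and `SUPERFOCUS/output/PA.01`, so the two runs never share intermediate `.daa` files.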
I premade the databases for conda download.
Please check your `diamond` version with `diamond --version` and then read the diamond documentation to know which version to download. You can also find out the database version you have installed with `diamond dbinfo`.
After downloading, you need to copy these to `lib/python3.8/site-packages/superfocus_app/db/static/diamond` in the same location as superfocus, e.g. for `90_clusters`:
These are smaller downloads than the raw files (`db.zip` is 3.3 GB).
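The copy step described above might be scripted along these lines. This is a sketch under assumptions: `install_superfocus_db` is a made-up name, the environment prefix (for example a conda environment directory) must be supplied by you, and the `python3.8` path segment is copied from the comment above and may differ on your system (the traceback earlier in this thread shows `python3.7`).

```shell
# Hypothetical helper: copy a downloaded, unzipped DIAMOND database into the
# directory named in the comment above.
#   $1: environment prefix (assumption: e.g. your conda environment directory)
#   $2: path to the unzipped .dmnd file
install_superfocus_db() {
    prefix=$1
    db_file=$2
    # Adjust python3.8 if your environment uses a different Python version.
    dest="$prefix/lib/python3.8/site-packages/superfocus_app/db/static/diamond"
    [ -f "$db_file" ] || { echo "database file not found: $db_file" >&2; return 1; }
    mkdir -p "$dest" && cp "$db_file" "$dest/"
}

# Report the installed DIAMOND version first, if diamond is on the PATH,
# so the matching database download can be chosen.
if command -v diamond >/dev/null 2>&1; then
    diamond --version
fi
```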