Closed drhoads closed 8 months ago
Hi @drhoads,
What version of BBT are you running?
Just installed using conda this AM:
(MultiQC) @.***:/mnt/f/DNAwork/Scohnii/genomes/DoNotUse/biobloom$ biobloomcategorizer --version
biobloomcategorizer (BIOBLOOMTOOLS) 2.3.5-1-gfa70-dirty
Written by Justin Chu.
Copyright 2013 Canada's Michael Smith Genome Science Centre
Ok great!
Could you do one small test for me: check whether BBT recognizes your files if you specify just one Bloom filter (i.e., so you don't have to use quotes)?
Ex:
biobloomcategorizer -e -p 1637 –f 1638filter.bf ../1637_S141_R1_001_ptrim.fq ../1637_S141_R2_001_ptrim.fq
Sorry was away for 5 hours, and just tried it with just one filter and no quotes. Same issue.
(MultiQC) @.***:/mnt/f/DNAwork/Scohnii/genomes/DoNotUse/biobloom$ biobloomcategorizer -e -p 1637 -f 1638filter.bf ../1637_S141_R1_001_ptrim.fq ../1637_S141_R2_001_ptrim.fq
Usage of paired end mode:
BioBloomCategorizer [OPTION]... -f "[FILTER1]..." [FILEPAIR1] [FILEPAIR2]
or
BioBloomCategorizer [OPTION]... -f "[FILTER1]..." [SMARTPAIR]
Error: Need Filter File (-f)
Try '--help' for more information.
FYI, the .bf file is 3.16 MB.
Hi @drhoads,
Thanks for the update! That is very strange indeed.
So, this section of code seems to be triggered, despite you specifying a valid path to a BF: https://github.com/bcgsc/biobloom/blob/master/BioBloomCategorizer/BioBloomCategorizer.cpp#L382-L385
Would you mind sharing your full standard error and standard out for the biobloommaker commands, as well as the contents of the *txt files that were generated along with the Bloom filters? I just want to double check that those steps completed as expected.
(MultiQC) drhoads@ARSC-A-G4LJXP3:/mnt/f/DNAwork/Scohnii/genomes/DoNotUse$ biobloommaker -p 1638filter -o biobloom ../StrainNameFas/1638.fna
Opening File ../StrainNameFas/1638.fna
Allocating 26055104 bits of space for filter and will output filter this size (plus header)
Approximated (due to false positives) total unique k-mers in reference files 2567799
Writing a 3256888 byte filter to biobloom/1638filter.bf on disk.
Filter Creation Complete.
1638filter.zip
Thanks for all that info - I can confirm that on my end, if I use that exact filter, it loads the Bloom filter fine:
(btl) [lcoombe@hpce705 tmp]$ biobloomcategorizer -f 1638filter.bf ../DRR021766_1.fastq.gz
Min score threshold: 0.15
Starting to Load Filters.
Loaded Filter: 1638filter
Filter Loading Complete.
So, I'm wondering if it is something related to WSL2Ubuntu that BBT is having an issue with.
@jwcodee / @JustinChu / @parham-k - Do you have any ideas as to why the constructed BFs are not being recognized properly?
Could it be that I am installing in a conda env for MultiQC? I came across BBT in the listing of tools for use in MultiQC, and this is my first foray with MultiQC. In the AM I will try installing BBT in its own env and try again. Will report back what I find out.
I tried installing biobloomtools in its own conda env, and the same error occurred:
(BioBloomTool) drhoads@ARSC-A-G4LJXP3:/mnt/f/DNAwork/Scohnii/genomes/DoNotUse/biobloom$ biobloomcategorizer -e -p 1637 –f 1638filter.bf ../1637_S141_R1_001_ptrim.fq ../1637_S141_R2_001_ptrim.fq
Usage of paired end mode:
BioBloomCategorizer [OPTION]... -f "[FILTER1]..." [FILEPAIR1] [FILEPAIR2]
or
BioBloomCategorizer [OPTION]... -f "[FILTER1]..." [SMARTPAIR]
Error: Need Filter File (-f)
Try '--help' for more information.
I was able to install and run all the commands on our HPC system, so whatever it is must relate to either WSL2 or Ubuntu. I have another machine in my lab with WSL2-Ubuntu, and a LinuxMint machine. If I get a chance on Thursday I will see what happens on them. It sure is strange, and I keep looking for a mistyped word or something. Usually it is some small insidious thing. For me it is usually a / vs \ since I move between Windows and Linux, or a single vs double hyphen.
Oh great, thank you for that update! My guess is that it's the WSL2 environment; we have seen before with other tools that we can get rather cryptic errors without straightforward solutions. We generally work on CentOS machines, but have run BBT on Ubuntu before without an issue. If possible, working off the HPC system will probably be your best bet! WSL2 issues are rather hard for us to troubleshoot, since we don't have access to that environment.
Hey this is a bit silly but the original command:
biobloomcategorizer -e -p biobloom/1637 –f "biobloom/1638filter.bf biobloom/1715filter.bf" 1637_S141_R1_001_ptrim.fq 1637_S141_R2_001_ptrim.fq
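is using the f option, but specifically with the character – (an en dash, U+2013, which is byte 150 in Windows-1252 and not an ASCII character at all) rather than - (the ASCII hyphen-minus, which is 45). I'm not sure how this happened for you, but I know for a fact Ubuntu doesn't autoconvert – to -.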
Can you double check this is working:
biobloomcategorizer -e -p biobloom/1637 -f "biobloom/1638filter.bf biobloom/1715filter.bf" 1637_S141_R1_001_ptrim.fq 1637_S141_R2_001_ptrim.fq
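A look-alike dash like this is hard to spot by eye, so here is a minimal sketch of one way to check a pasted command before running it. It is not part of BBT; the function name `find_non_ascii` and the sample strings are made up for illustration, and the en dash is written as the escape `\u2013` so it cannot be confused with a hyphen.

```python
import unicodedata


def find_non_ascii(command: str):
    """Return (position, character, code point, Unicode name) for every
    character in `command` that falls outside the ASCII range (0-127)."""
    hits = []
    for pos, ch in enumerate(command):
        if ord(ch) > 127:
            name = unicodedata.name(ch, "UNKNOWN")
            hits.append((pos, ch, f"U+{ord(ch):04X}", name))
    return hits


# Hypothetical samples: the first has an en dash before "f", the second a real hyphen.
pasted = "biobloomcategorizer -e -p 1637 \u2013f 1638filter.bf"
typed = "biobloomcategorizer -e -p 1637 -f 1638filter.bf"

for label, cmd in (("pasted", pasted), ("typed", typed)):
    hits = find_non_ascii(cmd)
    if hits:
        for pos, ch, code, name in hits:
            print(f"{label}: column {pos}: {ch!r} is {code} ({name})")
    else:
        print(f"{label}: only ASCII characters found")
```

Any hit in front of an option letter means the shell will hand the program something that is not an option flag, which would be consistent with the "Need Filter File (-f)" error above.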
Huh good catch - thanks for noticing that @JustinChu!
Yep, just got back from the lab, where I installed on my other WSL2Ubuntu machine, and it ran fine, but I had to type in the command. On my home machine I had been copy-pasting from a journal I keep of all my work. I deleted the -f and typed it in, and it ran just fine. If you are wondering where I got the chr(150), it was from your GitHub instructions, because I used the copy command and then modified it to suit my needs. To confirm this I went back to your GitHub page (https://github.com/bcgsc/biobloom), under section 3, where it says:
There are some advanced options one can use outlined in section 5. Notable option one can use is the paired end mode -e:
./biobloomcategorizer -e –p /output/prefix –f "filter1.bf filter2.bf filter3.bf" inputReads1_1.fq inputreads1_2.fq
-e will require that both reads match when making the call about what reference they belong in.
Then I copied the "-f" from that command and put it into my command that just worked, and it reverted to throwing the error. So, you might want to check that website and that impostor hyphen. On a delightful note the filtering cleaned out the contaminant in my NGS data and I got a great assembly. Thanks for a great tool that will be part of my arsenal from here on out.
Oh yeah, that's a good point. I noticed we have some of those issues in the README. I'm honestly not sure how that happened. I'll replace them all. Thanks.
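For anyone cleaning up their own notes or a README with the same problem, the following is a rough sketch of one possible approach, not the fix actually applied to the BBT README. It assumes that a dash look-alike (U+2010 through U+2015, or the minus sign U+2212) sitting at the start of a word and followed by a letter was meant to be an option hyphen; `fix_option_dashes` is a hypothetical helper name.

```python
import re

# A dash look-alike that starts a whitespace-delimited word and is
# immediately followed by a letter, e.g. "\u2013f" in a pasted command.
_FAKE_OPTION_DASH = re.compile(r"(?<!\S)[\u2010-\u2015\u2212](?=[A-Za-z])")


def fix_option_dashes(text: str) -> str:
    """Replace dash look-alikes in option position with an ASCII hyphen.

    Dashes used as punctuation in prose (surrounded by spaces, or between
    digits as in a range) are left untouched.
    """
    return _FAKE_OPTION_DASH.sub("-", text)


# Sample line modelled on the README command quoted in this thread.
readme_line = (
    "./biobloomcategorizer -e \u2013p /output/prefix "
    '\u2013f "filter1.bf filter2.bf filter3.bf" '
    "inputReads1_1.fq inputreads1_2.fq"
)
print(fix_option_dashes(readme_line))
```

The pattern is deliberately narrow so that legitimate en dashes in ordinary text survive; it would miss a look-alike used in a long option such as a double dash, so treat it as a starting point only.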
Need Filter File (-f)
I'm working in WSL2Ubuntu, trying to filter paired-end data from a bacterial genome where there is a contaminant. I have two reference genomes, 1638 and 1715, with the contaminated sample being 1637. I used biobloommaker to create two .bf files:
biobloommaker -p 1638filter -o biobloom ../StrainNameFas/1638.fna
biobloommaker -p 1715filter -o biobloom ../StrainNameFas/1715.fna
Then when I run the categorizer I get an error:
biobloomcategorizer -e -p biobloom/1637 –f "biobloom/1638filter.bf biobloom/1715filter.bf" 1637_S141_R1_001_ptrim.fq 1637_S141_R2_001_ptrim.fq
**Usage of paired end mode:
BioBloomCategorizer [OPTION]... -f "[FILTER1]..." [FILEPAIR1] [FILEPAIR2]
or
BioBloomCategorizer [OPTION]... -f "[FILTER1]..." [SMARTPAIR]
Error: Need Filter File (-f)
Try '--help' for more information.**
Can't figure out why it is not reading the filter files. I have even tried running from inside the biobloom folder:
drhoads@ARSC-A-G4LJXP3:/mnt/f/DNAwork/Scohnii/genomes/DoNotUse/biobloom$ biobloomcategorizer -e -p 1637 –f "16 38filter.bf 1715filter.bf" ../1637_S141_R1_001_ptrim.fq ../1637_S141_R2_001_ptrim.fq
Usage of paired end mode:
BioBloomCategorizer [OPTION]... -f "[FILTER1]..." [FILEPAIR1] [FILEPAIR2]
or
BioBloomCategorizer [OPTION]... -f "[FILTER1]..." [SMARTPAIR]
Error: Need Filter File (-f)
Try '--help' for more information.
(MultiQC) drhoads@ARSC-A-G4LJXP3:/mnt/f/DNAwork/Scohnii/genomes/DoNotUse/biobloom$
The two files are definitely in the biobloom folder along with their .txt files