Clearer way to compile the bloom filter and general cleanup

brainstorm commented 12 years ago

Please Enze, merge this. It would be easier for users and developers to have a simple Makefile instead of lengthy documentation.

tzcoolman commented 12 years ago

will do... @brainstorm

brainstorm commented 12 years ago

No, you did not merge my pull request correctly ;)

Please, just reopen the pull request and click the green "Merge this pull request automatically" button; that should do it for you. I also recommend that you give this a try:

http://try.github.com

@arvestad Maybe you want to try it too!

arvestad commented 12 years ago

@arvestad Maybe you want to try it too!

I only read man pages.

Lasse

(and bug colleagues... :-)

brainstorm commented 12 years ago

Bug at will then, no problem @arvestad ;)

Btw, I managed to build the bloom filter, but now it segfaults when querying:

$ cat ref.bloom
~/dev/Boutonniere/tests/data/ecoli_partial.bloom

$ ./simple_check -m 1 -q ~/dev/Boutonniere/tests/data/100reads_with_1read_ecoli.fastq -l ref.bloom -t 0.8 -s 1
Mode : The argument of -m is 1
Query : The argument of -q is /bubo/home/h5/roman/dev/Boutonniere/tests/data/100reads_with_1read_ecoli.fastq
Bloom list : The argument of -l is ref.bloom
Tolerant rate: The argument of -t is 0.8
Sampling rate: The argument of -s is 1
distributing... offset->394 TYPE->2
Segmentation fault (core dumped)

On 16 Aug 2012 at 13:25, Lars Arvestad wrote:

@arvestad Maybe you want to try it too!

I only read man pages.

Lasse

(and bug colleagues... :-)

The '-l' option expects a list of all bloom filters, which means that in your case you should create a file and type 'ref.bloom' in it.

ONE filter name per line.
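
As a rough illustration of that format, a list file could be written and sanity-checked as below. This is only a sketch: `check_filter_list` is a hypothetical helper that is not part of the tool, the file names are just examples, and it assumes the listed paths are resolvable from the current directory.

```shell
# Hypothetical helper (not part of the tool): check that every bloom
# filter named in a list file exists. Expects one filter name per line.
check_filter_list() {
    list=$1
    status=0
    while IFS= read -r filter || [ -n "$filter" ]; do
        [ -z "$filter" ] && continue    # skip blank lines
        if [ ! -f "$filter" ]; then
            echo "missing filter: $filter" >&2
            status=1
        fi
    done < "$list"
    return "$status"
}

# Example: a list naming a single filter, one name per line.
printf '%s\n' ecoli_partial.bloom > bloom_filters.list
check_filter_list bloom_filters.list || echo "fix the list before running simple_check"
```

The helper returns non-zero when any listed filter is missing, so it can be run before the query step.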

brainstorm commented 12 years ago

Enze, I guess it was you editing my response instead of replying...

ref.bloom is indeed a file listing the bloom filter, as I said in the cat command:

$ cat ref.bloom
~/dev/Boutonniere/tests/data/ecoli_partial.bloom

In other words:

$ file ref.bloom
ref.bloom: ASCII text
$ file ecoli_partial.bloom
ecoli_partial.bloom: data

Sure, indeed, you have a point, the extensions might be misleading:

$ mv ref.bloom bloom_filters.list

But that's not the problem:

$ ./simple_check -m 1 -q 100readsecoli.fastq -l bloom_filters.list -t 0.8 -s 1
(...)
distributing... offset->394 TYPE->2
Segmentation fault (core dumped)

Please, feel free to walk up to my desk on floor 3 quadrant 2, we can have a chat and fix this if you wish.

tzcoolman commented 12 years ago

Hej Roman,

I am at home now. Can we have this chat around 3pm or later?

I guess that the problem comes from the dataset coz it is so tiny. I would like to see your dataset.

Enze

brainstorm commented 12 years ago

I'm building the bloom filter against the K12 ecoli genome:

http://www.ncbi.nlm.nih.gov/nuccore/U00096.2

Specifically, so that you can reproduce it:

$ cat reference.list
/bubo/nobackup/uppnex/reference/biodata/genomes/Ecoli/eschColi_K12/seq/eschColi_K12.fasta

$ ./bloom_build -k 21 -l reference.list

Then, when it comes to querying, I simply use an arbitrary fastq file with the first read containing part of the ecoli genome sequence. The rest can be arbitrary/random reads and qualities:

$ head -4 100reads_ecoli.fastq


@HWI-ST188:2:1101:2751:1987#0/1
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTT
+
BP\aceeefgggfhiifghiihgiiihiiiihhhhhhhfhgcgh_fegefafhhihcegbgafdbdgggceeecdd]^aWZ^Y]bba^[_b]GTXX]aOPJPS`BB
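
For a self-contained test case, a first read like this could also be generated from the reference itself. The following is only a sketch: `make_read` is a hypothetical helper, the read name and the constant quality character `I` are made up, and it assumes a plain single-sequence FASTA file.

```shell
# Hypothetical helper: print one FASTQ record built from the start of a FASTA file.
# Usage: make_read reference.fasta [read_length]
make_read() {
    fasta=$1
    len=${2:-100}
    # Drop '>' header lines, join the sequence lines, keep the first $len bases.
    bases=$(grep -v '^>' "$fasta" | tr -d '\n' | cut -c1-"$len")
    # Dummy qualities: one 'I' per base (an arbitrary placeholder value).
    quals=$(printf '%s\n' "$bases" | sed 's/./I/g')
    printf '@synthetic_read_1\n%s\n+\n%s\n' "$bases" "$quals"
}
```

For example, `make_read eschColi_K12.fasta 100 > first_read.fastq` would give a single-read file; arbitrary reads can then be appended after it.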

Hope that helps!

arvestad commented 12 years ago

Could you add that data to a subdirectory so one can quickly and easily run a test case?

I have made some changes so that it compiles on MacOS now. It wasn't too difficult: some ugly #ifdefs (I am sure the conditional compilation can be made more beautiful) and an upgrade to a more recent version of gcc (later than v4.5, I think; I picked 4.7) were all that was needed.

But I want to test before pushing the changes.

Lasse

On 16 Aug 2012 at 14:31, Roman Valls wrote:

I'm building the bloom filter against the K12 ecoli genome:

http://www.ncbi.nlm.nih.gov/nuccore/U00096.2

Specifically, so that you can reproduce it:

$ cat reference.list
/bubo/nobackup/uppnex/reference/biodata/genomes/Ecoli/eschColi_K12/seq/eschColi_K12.fasta

$ ./bloom_build -k 21 -l reference.list

Then, when it comes to querying, I simply use an arbitrary fastq file with the first read containing part of the ecoli genome sequence. The rest can be arbitrary/random reads and qualities:

$ head -4 100reads_ecoli.fastq

@HWI-ST188:2:1101:2751:1987#0/1
AGCTTTTCATTCTGACTGCAACGGGCAATATGTCTCTGTGTGGATTAAAAAAAGAGTGTCTGATAGCAGCTTCTGAACTGGTTACCTGCCGTGAGTAAATTAAAATTT
+
BP\aceeefgggfhiifghiihgiiihiiiihhhhhhhfhgcgh_fegefafhhihcegbgafdbdgggceeecdd]^aWZ^Y]bba^[_b]GTXX]aOPJPS`BB

Hope that helps!

arvestad commented 12 years ago

BTW, when I push those changes there will be changes all over the files because I have indented the code here and there. Enze, it is just not good enough not to indent your code!

Lasse

tzcoolman commented 12 years ago

Sorry Lars. I forgot to tidy up the code and run it through an indent tool. In fact, there is a very simple one: just run the command "indent target.c" in any Linux environment. I used to use that.

tzcoolman / FACS-OLD

Clearer way to compile the bloom filter and general cleanup #1