tzcoolman / FACS-OLD

Command line parameters #14

Closed henrikstranneheim closed 11 years ago

henrikstranneheim commented 12 years ago

Hi,

  1. I noticed that the command line parameters used by bloom_build are not the same as those reported by -help. This makes the program really hard to use and understand. Whenever the parameters change, the -help documentation also needs to be updated.
  2. When running bloom_build with -help you get the USAGE info below. As I understand it, simple_check is the contamination script and has nothing to do with bloom_build, which builds the filters from references. This text needs to be updated with the correct flags and script name.

USAGE

For Bloom build:

./simple_check -m 1 -k 21 -e 0.005 -l file_list -p test/

Arguments:

-m mode (default 1)

1--> build one filter for each file in the list

2--> build one filter for every file in the list

-k k_mer size (default 1)

-e error rate (default 0.0005)

-l list containing all references files. One file per line

-p prefix (default same path as the script)

  3. The format of the reference list must be described (full path or just file name, separators, etc.).
  4. A default size of 1 for -k is nonsensical. I would not go lower than 10, and that only for really small genomes (<16kb).
  5. The -o parameter in bloom_build wants a file name (whole path), but that does not make sense when I want to build several Bloom filters from a reference list. It would be better for -m 1 to ask for a directory and reuse the name of each reference, changing the extension, for example from .fasta to .bloom.
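
The naming rule proposed in the last point can be sketched in C as follows. This is only an illustration under stated assumptions: `bloom_path_for` is a hypothetical helper, not a function in the FACS sources, and it assumes "/" as the path separator and that everything after the last dot in the reference name is the extension.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of FACS): build "<out_dir>/<ref stem>.bloom".
 * Returns 0 on success, -1 if the result does not fit in dst. */
int bloom_path_for(const char *out_dir, const char *ref_path,
                   char *dst, size_t dst_size)
{
    assert(out_dir && ref_path && dst);

    /* Strip any leading directories from the reference path. */
    const char *base = strrchr(ref_path, '/');
    base = base ? base + 1 : ref_path;

    /* Cut at the last dot, if any, so "chr21.fasta" becomes "chr21". */
    const char *dot = strrchr(base, '.');
    size_t stem_len = (dot && dot != base) ? (size_t)(dot - base) : strlen(base);

    /* Avoid a doubled slash when out_dir already ends with one. */
    size_t dir_len = strlen(out_dir);
    const char *sep = (dir_len > 0 && out_dir[dir_len - 1] == '/') ? "" : "/";

    int n = snprintf(dst, dst_size, "%s%s%.*s.bloom",
                     out_dir, sep, (int)stem_len, base);
    return (n < 0 || (size_t)n >= dst_size) ? -1 : 0;
}
```

With an output directory of `filters` and a reference `refs/chr21.fasta`, the helper writes `filters/chr21.bloom` into the buffer.
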
tzcoolman commented 12 years ago

Hej Henrik,

Please 'git pull' it first. The reason that you still get this stuff is that you are using the old binary file.

You have to 'make clean' and 'make all' after pulling.

tzcoolman commented 12 years ago

And be advised... The new binary doesn't support a Bloom file list any more. Roman thought we should keep it simple and use a shell script to process multiple files. But we can discuss it a bit if you want. I still have the old version that supports lists.

henrikstranneheim commented 12 years ago

[henriks@kalkyl1 DRASS]$ make clean
rm -f core.* *.o bloom_build simple_check simple_remove
[henriks@kalkyl1 DRASS]$ git pull
Already up-to-date.
[henriks@kalkyl1 DRASS]$ make all
Make sure you have MPI support on your cluster hint: module load openmpi
mpicc -c *.c -O3 -D_FILE_OFFSET_BITS=64 -D_LARGE_FILE -fopenmp
bloom.c: In function "help":
bloom.c:562: error: expected ";" before "ptions"
bloom.c:562: error: stray "\" in program
bloom.c:562:10: warning: missing terminating " character
bloom.c:562: error: missing terminating " character
bloom.c: In function "build_help":
bloom.c:596: error: expected ";" before "printf"
make: *** [all] Error 1

Henrik Stranneheim Ph.D. Department of Molecular Medicine and Surgery Karolinska Institute Science for Life Laboratory, KISP SE-171 65 Solna, Sweden

E-mail: henrik.stranneheim@scilifelab.se Phone: +46 (0)8 524 81487 (Office) Phone: +46 (0)736251487 (Mobile) Visiting address: Tometebodavägen 23A

henrikstranneheim commented 12 years ago

That will make it a lot harder to use for the general public.

Not everyone knows shell-scripting but anyone can compile a list if the format is well described.

tzcoolman commented 12 years ago

There must have been some error on GitHub. I updated something yesterday and GitHub didn't get it. I have pushed again. Please pull.

tzcoolman commented 12 years ago

:-/
Everybody has a point. Maybe I should add a second version that supports lists?

Let me talk to Lars first.

tzcoolman commented 12 years ago

Shit.... Happened again??

I'll come to the lab when I finish the FIFO support.

Enze

henrikstranneheim commented 12 years ago

Alright. I got the updated version working.

henrikstranneheim commented 12 years ago

There is no flag for supplying the query data set in simple_remove.

henrikstranneheim commented 12 years ago

In simple_remove, if you supply:

-o analysis/henriks/Sim_HC/21_0.005/Homo_sapiens_51511750_NC_000021.7_chr21

you get:

match->analysis/henriks/Sim_HC/21_0.005/Homo_sapiens_51511750_NC_000021.7_chr21SimHC-454.7b8911bf_contam_Homo_sapiens_51511750_NC_000021.7_chr21.fasta
mis->analysis/henriks/Sim_HC/21_0.005/Homo_sapiens_51511750_NC_000021.7_chr21SimHC-454.7b8911bf_clean_Homo_sapiens_51511750_NC_000021.7_chr21.fasta

which does not look very nice.

tzcoolman commented 12 years ago

Okay... I still have the version that supports lists. So if everyone agrees on the list feature, I can revert my program a little bit. So don't worry. As for the 'ugly' match/mis filenames, I can change them if you have a good idea.

Enze

arvestad commented 12 years ago

I think that the default behavior should be to take input files on the command line; this is standard Unix practice. It would be nice to have Henrik's input list (a file containing filenames, one per line?) as an alternative input feature. Internally, it is not hard to move command line filenames OR filenames read from a file into a simple list which is then iterated over.

Lasse
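
The "simple list" described above could look roughly like the following C sketch. It is an illustration only: `name_list` and its functions are hypothetical names, not FACS code, and the list file format is assumed to be one file name per line, with empty lines ignored and lines shorter than 4096 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of file names (not FACS code). */
typedef struct {
    char **items;
    size_t count;
    size_t capacity;
} name_list;

/* Append a copy of name; returns 0 on success, -1 on allocation failure. */
int name_list_add(name_list *list, const char *name)
{
    assert(list && name);
    if (list->count == list->capacity) {
        size_t cap = list->capacity ? list->capacity * 2 : 8;
        char **grown = realloc(list->items, cap * sizeof *grown);
        if (!grown) return -1;
        list->items = grown;
        list->capacity = cap;
    }
    size_t len = strlen(name) + 1;
    char *copy = malloc(len);
    if (!copy) return -1;
    memcpy(copy, name, len);
    list->items[list->count++] = copy;
    return 0;
}

/* Read one file name per line from fp, skipping empty lines.
 * Assumes each line fits in the 4096-byte buffer. */
int name_list_add_from_stream(name_list *list, FILE *fp)
{
    char line[4096];
    while (fgets(line, sizeof line, fp)) {
        line[strcspn(line, "\r\n")] = '\0';   /* drop the line ending */
        if (line[0] != '\0' && name_list_add(list, line) != 0)
            return -1;
    }
    return ferror(fp) ? -1 : 0;
}

/* Release every stored name and reset the list to empty. */
void name_list_free(name_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->capacity = 0;
}
```

A caller would start from `name_list list = {0};`, call `name_list_add` for each command line argument or `name_list_add_from_stream` on an opened list file, and then loop over `list.items[0 .. list.count-1]` regardless of where the names came from.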

henrikstranneheim commented 12 years ago

I think we should support a file, a directory, or a list as input. That offers a lot of flexibility, and it's up to the user which one to use.
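
As a rough sketch of how a program might tell these inputs apart, the C fragment below uses POSIX `stat()`. The names `input_kind` and `classify_input` are hypothetical and not taken from FACS. Note that a list file is an ordinary regular file, so it cannot be distinguished from a sequence file this way; it would presumably still need its own flag, as the old -l option did.

```c
#include <assert.h>
#include <sys/stat.h>

/* Hypothetical classification of an input path (not FACS code). */
typedef enum { INPUT_MISSING, INPUT_FILE, INPUT_DIR, INPUT_OTHER } input_kind;

input_kind classify_input(const char *path)
{
    struct stat st;
    assert(path);
    if (stat(path, &st) != 0)
        return INPUT_MISSING;          /* does not exist or is not accessible */
    if (S_ISDIR(st.st_mode))
        return INPUT_DIR;              /* process every file inside */
    if (S_ISREG(st.st_mode))
        return INPUT_FILE;             /* a single sequence file or a list file */
    return INPUT_OTHER;                /* e.g. a FIFO or device */
}
```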

henrikstranneheim commented 12 years ago

  1. Document if the user should supply whole path or just filename. If the user supplies a filename, add a prefix or suffix.

Example: -o analysis/henriks/Sim_HC/21_0.005/Homo_sapiens_51511750_NC_000021.7_chr21

yields: analysis/henriks/Sim_HC/21_0.005/Homo_sapiens_51511750_NC_000021.7_chr21_SimHC-454.7b8911bf_clean.fasta (the extension should be the same as that of the query data set input file)
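
The naming scheme in this example can be sketched in C as below. It is a sketch under assumptions, not the simple_remove implementation: `labelled_output_path` is a hypothetical helper, the label would be something like "clean" or "contam", and the extension is taken to be everything from the last dot of the query file name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not FACS code): write
 * "<prefix>_<query stem>_<label><query extension>" into dst.
 * Returns 0 on success, -1 if dst is too small. */
int labelled_output_path(const char *prefix, const char *query_path,
                         const char *label, char *dst, size_t dst_size)
{
    assert(prefix && query_path && label && dst);

    /* Keep only the file name part of the query path. */
    const char *base = strrchr(query_path, '/');
    base = base ? base + 1 : query_path;

    /* Split at the last dot; a name that starts with its only dot has no extension. */
    const char *dot = strrchr(base, '.');
    if (dot == base)
        dot = NULL;
    size_t stem_len = dot ? (size_t)(dot - base) : strlen(base);
    const char *ext = dot ? dot : "";

    int n = snprintf(dst, dst_size, "%s_%.*s_%s%s",
                     prefix, (int)stem_len, base, label, ext);
    return (n < 0 || (size_t)n >= dst_size) ? -1 : 0;
}
```

Because the extension is copied from the query file, a .fastq query would produce a .fastq output without any extra option.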

tzcoolman commented 12 years ago

Okay... I'll try to make it support both single files and lists.

henrikstranneheim commented 12 years ago

Excellent!

brainstorm commented 11 years ago

This issue should be closed in favor of pull request #29. I cannot close it myself. Enze, could you verify that the requested functionality is in the pull request and close it?

Thanks!