Prepare reads with just --reads_name

biobenkj commented 8 years ago

Is it possible to generate baga.CollectData.Reads-your_read_group_name.baga without having to specify the --subsample_to_cov option? For example: baga/baga_cli.py PrepareReads --reads_name your_read_group_name

It would be nice to make full use of the reads instead of subsampling. Thanks for your time!

daveuu commented 8 years ago

Yes:

baga_cli.py PrepareReads --adaptors fullsample --reads_name <myreads>

The default option for --adaptors is actually subsample, which uses the output that --subsample_to_cov produces. With fullsample, the command above instead goes to the original read files given to CollectData. Some documentation for the different command line tasks is available via:

baga_cli.py PrepareReads -h

Similarly, if your reads are ready for analysis and you don't need any of the PrepareReads options you can skip it with --prepared in the alignment step, e.g.:

baga/baga_cli.py AlignReads \
--reads_name Liverpool \
--genome_name NC_011770.1 \
--prepared \
--align --deduplicate
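
To put the two routes side by side, here is a minimal dry-run sketch. It only prints the commands rather than running them, and it uses only the flags quoted in this thread. The helper names prepare_cmd and align_prepared_cmd are made up for illustration, and the read group and genome names are just the placeholders from the example above.

```shell
#!/bin/sh
# Dry-run sketch: prints baga_cli.py commands instead of executing them.
# READS and GENOME are placeholder values taken from the example above.
READS="${READS:-Liverpool}"
GENOME="${GENOME:-NC_011770.1}"

# Hypothetical helper: PrepareReads on the original (not subsampled) reads.
prepare_cmd() {
    echo "baga_cli.py PrepareReads --adaptors fullsample --reads_name $1"
}

# Hypothetical helper: skip PrepareReads for reads that are already
# analysis-ready by passing --prepared to AlignReads.
align_prepared_cmd() {
    echo "baga_cli.py AlignReads --reads_name $1 --genome_name $2 --prepared --align --deduplicate"
}

prepare_cmd "$READS"
align_prepared_cmd "$READS" "$GENOME"
```

Check the printed commands against baga_cli.py PrepareReads -h and baga_cli.py AlignReads -h before running them for real.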

Cheers! I hope it comes in useful.

Development is ongoing. One particular limitation worth knowing about is that BAGA doesn't handle multi-replicon genomes, i.e., genomes made up of more than one replicon (chromosomes and/or plasmids). Work is underway on that, though. Let me know if you'd need it sooner rather than later.

biobenkj commented 8 years ago

:+1: Thanks for the quick reply. This is definitely something the lab will make use of. I am currently working with a multi-replicon genome (1 chr and 1 plasmid). However, I know there are no variants on the plasmid. Working through the data now on AWS EC2. Thanks again. Also, fine to close if you'd like.

daveuu / baga

Prepare reads with just --reads_name #7