COMBINE-lab / radtk

Various tools for working with RAD files
BSD 3-Clause "New" or "Revised" License

subsampling and splitting rad file #7

Open wangjiawen2013 opened 4 months ago

wangjiawen2013 commented 4 months ago

It would be useful to support subsampling and splitting RAD files with radtk.

rob-p commented 4 months ago

The splitting operation is pretty clear to me, but can you elaborate a bit more about what you’d expect from a subsampling command? For example, what would the parameters be, what would the expected input and output be?

wangjiawen2013 commented 4 months ago

You could model it on samtools. Please refer to these posts:

https://www.biostars.org/p/76791/
https://bioinformatics.stackexchange.com/questions/402/how-can-i-downsample-a-bam-file-while-keeping-both-reads-in-pairs

The input is a RAD file and the output is a smaller RAD file, randomly subsampled from the original. Ideally the parameters would match those of samtools, so that samtools users can switch over seamlessly.
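To make the request concrete, here is a minimal Python sketch of seeded, name-based subsampling in the spirit of `samtools view -s`, where records sharing a name (such as both mates of a pair) are kept or dropped together. It is only an illustration under stated assumptions: `keep_record`, `subsample`, and the `(name, payload)` tuples are hypothetical stand-ins and do not reflect the RAD format or any existing radtk interface.

```python
import hashlib


def keep_record(name: str, fraction: float, seed: int = 0) -> bool:
    """Decide deterministically whether to keep a record, based on its name.

    Hypothetical helper: records that share a name (for example both mates
    of a read pair) always receive the same decision for a given seed.
    """
    digest = hashlib.sha256(f"{seed}:{name}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64).
    h = int.from_bytes(digest[:8], "big")
    # Keep the record if its hash falls in the lowest `fraction` of the range.
    return h < fraction * 2**64


def subsample(records, fraction: float, seed: int = 0):
    """Return the records whose names pass the seeded hash test.

    `records` is assumed to be an iterable of (name, payload) tuples,
    a stand-in for real RAD records.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return [rec for rec in records if keep_record(rec[0], fraction, seed)]


if __name__ == "__main__":
    # Two records per name, mimicking paired reads.
    reads = [(f"read{i}", mate) for i in range(1000) for mate in (1, 2)]
    sampled = subsample(reads, fraction=0.1, seed=42)
    print(f"kept {len(sampled)} of {len(reads)} records")
```

Hashing the name rather than drawing a fresh random number per record makes the output reproducible for a given seed and keeps same-named records together, which is the behaviour the linked posts ask for.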

rob-p commented 4 months ago

I now have a prototype implementation of split on the dev branch. I will probably polish this a bit and cut another release, and then start working on sampling afterward.

wangjiawen2013 commented 4 months ago

Looking forward to it!

rob-p commented 4 months ago

Hi @wangjiawen2013,

Great. I just cut the release of 0.2.0 with the split command. I’ll ping back here again when I have an implementation of sampling.

Best, Rob