DasLab / big_library_design

MIT License

For long barcodes, choose random barcodes instead of generating all #20

Open rkretsch opened 1 year ago

rkretsch commented 1 year ago

For barcodes of length >10, generating all possible barcodes up front can take a while and, depending on the machine, may run into speed or memory limits. It would be nice to add an option that generates a random barcode each time one is needed, instead of enumerating all of them first.
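A minimal sketch of what such an option could look like, not the repository's actual code. The names `random_barcode` and `sample_unique_barcodes` and the `ACGT` alphabet are assumptions made for illustration. It draws barcodes one at a time and only stores the ones already handed out, so memory grows with the number of barcodes requested rather than with 4^length.

```python
import random

ALPHABET = "ACGT"  # assumed alphabet; swap in "ACGU" for RNA


def random_barcode(length, rng, alphabet=ALPHABET):
    """Draw one barcode uniformly at random (hypothetical helper)."""
    return "".join(rng.choice(alphabet) for _ in range(length))


def sample_unique_barcodes(n, length, seed=None, alphabet=ALPHABET):
    """Return n distinct random barcodes without enumerating all of them."""
    if n > len(alphabet) ** length:
        raise ValueError("more barcodes requested than exist at this length")
    rng = random.Random(seed)
    seen = set()      # fast membership test for duplicates
    barcodes = []     # preserves the order in which barcodes were drawn
    while len(barcodes) < n:
        candidate = random_barcode(length, rng, alphabet)
        if candidate not in seen:
            seen.add(candidate)
            barcodes.append(candidate)
    return barcodes
```

Redrawing on a collision is cheap as long as the number of barcodes requested is small compared with the total number possible, which is the case this issue is about.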

rkretsch commented 1 year ago

Potentially, this could integrate the recently added minimum edit distance functionality, so that it not only samples a new barcode but also guarantees that the new barcode is far enough from all existing barcodes.
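One way this could work is rejection sampling: draw a candidate, keep it only if it is at least `min_dist` edits away from every barcode accepted so far. The sketch below is an illustration under assumptions, not the repository's implementation. `edit_distance` here is a plain Levenshtein stand-in for whatever the existing edit distance code does, and `sample_distant_barcodes`, `max_tries` and the `ACGT` alphabet are hypothetical.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance via row-by-row dynamic programming (stand-in)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def sample_distant_barcodes(n, length, min_dist, seed=None,
                            max_tries=100_000, alphabet="ACGT"):
    """Draw random barcodes, accepting only those far enough from all accepted ones."""
    rng = random.Random(seed)
    accepted = []
    tries = 0
    while len(accepted) < n:
        if tries >= max_tries:
            raise RuntimeError("could not find enough barcodes; relax min_dist or n")
        tries += 1
        candidate = "".join(rng.choice(alphabet) for _ in range(length))
        # Compare against every accepted barcode; all() stops at the first failure.
        if all(edit_distance(candidate, b) >= min_dist for b in accepted):
            accepted.append(candidate)
    return accepted
```

Each new candidate is compared against all accepted barcodes, so the total cost grows roughly quadratically with the number of barcodes.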

rkretsch commented 1 year ago

Particularly for very long barcodes, we could also go through barcodes sequentially instead of randomly (so we would not have to check edit_distance over the full set) and randomize the order of the sequences instead. This would speed up the process for long barcodes and make parallelization easier.
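A rough sketch of the sequential idea, with hypothetical names (`sequential_barcodes`, `assign_barcodes`) and an assumed `ACGT` alphabet. Barcodes are produced lazily in lexicographic order, and the randomness comes from shuffling the sequences before pairing them up. This sketch does not enforce any minimum edit distance by itself; how that constraint would be met during sequential traversal is left open here.

```python
import itertools
import random


def sequential_barcodes(length, alphabet="ACGT"):
    """Lazily yield barcodes in lexicographic order, one at a time."""
    for letters in itertools.product(alphabet, repeat=length):
        yield "".join(letters)


def assign_barcodes(sequences, length, seed=None, alphabet="ACGT"):
    """Shuffle the sequences, then pair them with barcodes taken in order."""
    if len(sequences) > len(alphabet) ** length:
        raise ValueError("not enough barcodes of this length")
    shuffled = list(sequences)             # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)
    # zip stops once every sequence has a barcode, so only
    # len(sequences) barcodes are ever generated.
    return list(zip(shuffled, sequential_barcodes(length, alphabet)))
```

Because the barcode order is deterministic, parallel workers could each take a disjoint slice of the stream (for example with `itertools.islice`) without coordinating with each other.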

rkretsch commented 1 year ago

This is a speed-up for the non-parallelized case, so removing it from the 1M goal, which will be run in parallel.