In `extract_random_seqs_from_genome()`, it would be helpful to have an option that lets users decide whether to exclude sequences with too many Ns (e.g. any N at all, or more than 10% Ns). For me, it would be fine for this filtering step to happen after the X sequences are drawn (e.g. if 100 sequences are drawn and 10 are excluded because they have too many Ns, 90 sequences remain). It would also be great to have a short printout at the end that reports how many sequences were drawn and how many were filtered out because of Ns.
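To make the request concrete, here is a minimal sketch of the post-draw filtering I have in mind. It is not the existing implementation: the helper names `n_fraction` and `filter_by_n_content` and the `max_n_fraction` parameter are hypothetical, and I am assuming the drawn sequences are available as plain Python strings.

```python
def n_fraction(seq):
    """Return the fraction of bases in seq that are N (case-insensitive)."""
    if not seq:
        return 0.0
    return seq.upper().count("N") / len(seq)


def filter_by_n_content(seqs, max_n_fraction=0.0):
    """Keep sequences whose N fraction is at most max_n_fraction.

    Hypothetical helper: max_n_fraction=0.0 drops any sequence containing
    an N, max_n_fraction=0.1 drops sequences with more than 10% Ns.
    """
    kept = [s for s in seqs if n_fraction(s) <= max_n_fraction]
    n_removed = len(seqs) - len(kept)
    # Short summary of how many sequences were drawn and how many were dropped.
    print(
        f"Drew {len(seqs)} sequences; removed {n_removed} with "
        f"> {max_n_fraction:.0%} N; kept {len(kept)}."
    )
    return kept, n_removed


# Stand-in for the sequences returned by the random draw.
drawn = ["ACGTACGTAC", "ACGTNACGTA", "NNNNACGTAC", "acgtacgtac"]
kept, removed = filter_by_n_content(drawn, max_n_fraction=0.1)
```

Filtering after the draw means the returned count can be lower than the number requested, which is fine for my use case as long as the printout makes that visible.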