using region.max_len for format and check is a problem for nanopore reads

It's unclear what the max length for a nanopore seqspec read should be.

I tried using 2 million, as that's the reported maximum, but this causes problems for seqspec check and seqspec format, and makes it impossible to keep the yaml file reasonably sized.

Currently the checks require that the sequence length match max_len, with the sequence written out as a fixed string. Needless to say, with a 2 million base pair max_len this leads to a 4 megabyte yaml file.

Some options might be to use the min length for the sequence length, or to implement some kind of run-length encoding for the sequence strings instead of a massive list of Xs. Perhaps the sequence string could be something like X{2000000} instead.
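
To make the run-length idea concrete, here is a rough sketch of what encoding and length checking could look like. This is not existing seqspec code: the `X{2000000}` notation, the function names `encode` and `encoded_length`, and the `min_run` threshold are all assumptions made up for illustration. The point is that a check could compare lengths against max_len without ever expanding the string.

```python
import re

# Hypothetical notation: one letter, optionally followed by {count},
# e.g. "ACGTX{10}" stands for "ACGT" plus ten Xs.
_TOKEN = re.compile(r"([A-Za-z])(?:\{(\d+)\})?")


def encode(seq: str, min_run: int = 4) -> str:
    """Collapse runs of a repeated character into the char{count} form."""
    out = []
    for m in re.finditer(r"(.)\1*", seq):
        run = len(m.group(0))
        # Short runs stay literal so ordinary sequences remain readable.
        out.append(f"{m.group(1)}{{{run}}}" if run >= min_run else m.group(0))
    return "".join(out)


def encoded_length(seq: str) -> int:
    """Return the expanded length without building the expanded string."""
    total = 0
    pos = 0
    for m in _TOKEN.finditer(seq):
        if m.start() != pos:
            raise ValueError(f"unexpected character at position {pos}")
        total += int(m.group(2)) if m.group(2) else 1
        pos = m.end()
    if pos != len(seq):
        raise ValueError(f"unexpected character at position {pos}")
    return total
```

With something like this, a 2 million base pair placeholder would be stored as the ten-character string `X{2000000}`, and a length check could call `encoded_length` on it and compare the result to max_len directly.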

pachterlab / seqspec

using region.max_len for format and check is a problem for nanopore reads #47