Open HenrikBengtsson opened 8 years ago
For the record, I brought this up on the Bioconductor support site in March 2016 (https://support.bioconductor.org/p/79456/), where Herve replied that it was a useful idea and that it would make sense to add it to Biostrings.
Similar to a FASTA index file, which reports the name and length of each sequence, create an enhanced FASTA summary format that also reports an MD5 checksum per sequence. The checksum should be computed on the uppercased nucleotide letters, so that it does not depend on letter case.
This will make it easier to compare FASTA files.
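To make the idea concrete, here is a minimal sketch in Python (not Biostrings code) of what such a summary could compute. The function name `fasta_md5_summary` and the three-column name/length/MD5 layout are assumptions made for illustration, not a proposed API or file format. As in a FASTA index, the sequence name is taken to be the first whitespace-delimited token of the header line.

```python
import hashlib
from typing import Iterable, Iterator, Tuple


def fasta_md5_summary(lines: Iterable[str]) -> Iterator[Tuple[str, int, str]]:
    """Yield (name, length, md5) for each FASTA record.

    Hypothetical helper: the MD5 is computed on the uppercased sequence
    letters, with line breaks and surrounding whitespace removed.
    """
    name = None
    length = 0
    md5 = hashlib.md5()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if name is not None:
                yield name, length, md5.hexdigest()
            fields = line[1:].split()
            name = fields[0] if fields else ""
            length = 0
            md5 = hashlib.md5()
        elif name is not None:
            seq = line.upper()  # checksum must not depend on letter case
            length += len(seq)
            md5.update(seq.encode("ascii"))
    if name is not None:
        yield name, length, md5.hexdigest()


# Two toy inputs that differ only in case, line wrapping and header comments
fasta_a = [">chr1 test", "acgt", "ACGT", ">chr2", "NNNN"]
fasta_b = [">chr1", "ACGTACGT", ">chr2", "nnnn"]

summary_a = list(fasta_md5_summary(fasta_a))
summary_b = list(fasta_md5_summary(fasta_b))

for name, length, digest in summary_a:
    print(f"{name}\t{length}\t{digest}")

# The summaries agree even though the files are not byte-identical
print(summary_a == summary_b)
```

With this kind of summary, comparing two FASTA files reduces to comparing small tables rather than the sequences themselves.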
Example