Make the released sequence QC summary stats available

@alimanfoo @ahernank Regarding testing of the sequence QC summary stats data, which contain columns that contain contig identifiers, e.g. mean_cov_2L, which not only contain an uppercase letter, controversially, but also have a column order which is not the same as fixture.config['CONTIGS'].

I'm aware that insertion order in dictionary keys is guaranteed since Python 3.7, but I'm wondering whether the intention here was to actually assert that the order remained expected, rather than merely the set of columns.

def validate_metadata(df, expected_columns):
    # Check column names.
    expected_column_names = list(expected_columns.keys())
    assert df.columns.to_list() == expected_column_names

I was tempted to convert dicts like expected_columns to OrderedDict just to be explicit, but they would all need to be redefined as an array of tuples.

Are we aware of any reason that we need to check the specific order of columns in our sample metadata files? Would it be safe to change this test to compare sets rather than lists, and allow different column orders?

malariagen / malariagen-data-python

Make the released sequence QC summary stats available #547