Closed bogden1 closed 2 years ago
Not sure that this was really a lexical sort -- I suppose that it is something to do with what happens inside the Zooniverse reconcilers. Maybe sorting by subject id, for example.
In any case, joined.csv now sorts by volume and page number. We do not try to sort by row, but given the nature of the workflows, rows should already be in the correct order (and even if they were not, it is not the end of the world, as the volume and page will be correct). We use a stable sort to preserve that row order.
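A minimal sketch of that approach, not the actual code in the repository: the column names `volume`, `page`, `row` and `text`, the helper `sort_rows`, and the sample data are all assumptions made for illustration. It relies on Python's `sorted` being stable, so rows that share a volume and page keep the order they arrived in.

```python
import csv
import io


def sort_rows(rows):
    """Sort by volume, then page, both compared as integers.

    sorted() is stable, so rows with the same (volume, page) keep
    their incoming relative order; the row column is never inspected.
    Column names here are hypothetical.
    """
    return sorted(rows, key=lambda r: (int(r["volume"]), int(r["page"])))


# Hypothetical stand-in for the contents of joined.csv.
sample = io.StringIO(
    "volume,page,row,text\n"
    "2,1,1,d\n"
    "1,100,1,b\n"
    "1,100,2,c\n"
    "1,1,1,a\n"
)

rows = list(csv.DictReader(sample))
ordered = sort_rows(rows)

for r in ordered:
    print(r["volume"], r["page"], r["row"], r["text"])
```

Converting to `int` in the key is what gives numeric ordering; the two rows on volume 1, page 100 stay in their original sequence because the sort is stable.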
Update: strip_processed.py might be the culprit for the previous peculiar sort order.
joined.csv is output in a lexical sort order by volume and page number. This means, for example, that page 100 of a given volume is output ahead of page 1. This is a minor annoyance, but it is a little confusing and is easy to fix.
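For reference, a small illustration of how comparing page numbers as strings differs from comparing them as integers. The page values are made up. Note that a purely lexical sort places "100" ahead of "2" rather than ahead of "1", so the order seen in joined.csv may have had a different cause.

```python
# Hypothetical page numbers as they would be read from a CSV (strings).
pages = ["1", "2", "10", "100"]

# Compared as strings, character by character.
lexical = sorted(pages)

# Compared by numeric value.
numeric = sorted(pages, key=int)

print(lexical)
print(numeric)
```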