Based on my comment on the initial `augur merge` PR.

`augur merge` is stupidly slow for tiny datasets, e.g. taking a couple of seconds. That's due to Augur's own slow startup time and having to wait for it 2+n times, where n is the number of metadata tables being joined. On large datasets, this fixed startup cost shouldn't matter, but on small datasets it feels really dumb. Cutting out the additional startups by dropping the use of `augur read-file` and `augur write-file` makes it quite quick, as it should be. However, `augur {read,write}-file` are important for proper and robust handling of newlines and compression formats and can't be jettisoned without significant additional work. More to the point, we don't have to do that work (and take on the additional complexity) if we make other improvements.

Improvements we can/should make: