Closed fengelniederhammer closed 2 days ago
Thanks, this is good. I see what you mean now: the structure is clearer, at the expense of more layering/intermediation.
I have no objections to merging this in if it works and the tests of #3232 are adapted to work with it, but I don't have the bandwidth to do this myself right now.
There might be a slight perf hit to doing it this way here as opposed to the original due to extra copying but probably negligible.
Ah this is actually super easy to unit test now, easier than before - one extra advantage of this new organization
@fengelniederhammer I've adapted the unit tests and made a function name more precise (it also filters out extra fields that are not in the schema but in the metadata) - how does this look to you? Happy for you to merge this in.
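For readers skimming the thread, the filtering behaviour mentioned above could look roughly like the sketch below. The names `Schema` and `filterToSchemaFields` are hypothetical stand-ins, not the identifiers used in the PR; the only assumption is that metadata is a map keyed by field name.

```kotlin
// Hypothetical stand-in for the real schema: just the set of allowed field names.
data class Schema(val fieldNames: Set<String>)

// Keep only the metadata entries whose key is declared in the schema.
// filterKeys returns a new map and leaves the input untouched.
fun filterToSchemaFields(metadata: Map<String, String?>, schema: Schema): Map<String, String?> =
    metadata.filterKeys { it in schema.fieldNames }

fun main() {
    val schema = Schema(setOf("country", "date"))
    val metadata = mapOf(
        "country" to "CH",
        "date" to "2024-11-19",
        "internalNote" to "not part of the schema",
    )
    // "internalNote" is dropped because the schema does not declare it.
    println(filterToSchemaFields(metadata, schema))
}
```

Because the function is pure (map in, map out), it can be unit tested without any service wiring.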
I added a commit to use Hamcrest matchers, because they provide better assertion errors than assertTrue. Looks good to me :+1:
> There might be a slight perf hit to doing it this way here as opposed to the original due to extra copying but probably negligible.
According to the docs, copy does not copy in the sense of "copying memory", so I think there is actually no performance impact. (The method name is misleading if you're used to Rust or C++)
> Use the copy() function to copy an object, allowing you to alter some of its properties while keeping the rest unchanged.
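A minimal sketch of what this means in practice, assuming two hypothetical data classes (`Entry` and `Metadata` are illustrative, not types from this PR): `copy()` allocates one new outer object, while the properties that are not overridden keep pointing at the same instances.

```kotlin
// Hypothetical types for illustration only.
data class Metadata(val fields: Map<String, String>)

data class Entry(val accession: String, val metadata: Metadata)

// Returns a copy with a different accession; all other properties are carried over by reference.
fun withAccession(entry: Entry, accession: String): Entry = entry.copy(accession = accession)

fun main() {
    val original = Entry("A1", Metadata(mapOf("country" to "CH")))
    val renamed = withAccession(original, "A2")

    // copy() creates a new Entry instance...
    println(renamed !== original)
    // ...but the nested Metadata is the very same object, not a duplicate.
    println(renamed.metadata === original.metadata)
}
```

So the cost per `copy()` call is one small object allocation, independent of how large the shared `metadata` map is.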
Very nice, thanks! I see, makes sense that it's copy-on-write (CoW) under the hood or something like that.
Summary
https://github.com/loculus-project/loculus/pull/3232#issuecomment-2485795043
It's missing tests and probably some thought over how to embed this into the rest of the code (e.g. the CompressionService is still used separately), but this sketches how we could separate the concerns (metadata postprocessing vs. sequence compression). Separation of concerns is also my main motivation for this.
Screenshot
PR Checklist