Why?
Right now, running the sightglasspostprocessing step in generate_metadata takes more than 200 GB of memory,
which makes it buggy and painful to run. In the long term, memory use will grow as O(n) with the data size
if we keep loading the whole thing into memory and editing it there.
That is not necessary.
How?
Probably by using the lazy loading that polars offers.
What this probably needs:
Prune the logic down to an MVP protocol, then rewrite the indexing/loading logic, then add the features back.
Restriction:
Don't introduce huge changes or shift the tech stack.
When:
Before the next release.
Target:
Hopefully, by the next release we can finish the work on a normal PC (~100 GB RAM).