klay-music / klay-beam

Our Apache Beam Transforms and Pipelines

Include Encodec model in `job_nac.Dockerfile` #50

Closed mxkrn closed 11 months ago

mxkrn commented 1 year ago

The workers currently download the Encodec model every time they start up. This is wasteful and can be fixed easily by including the model in the Docker image.
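A minimal sketch of the idea, not the repo's actual code: a "download only if missing" helper that could run once while the image is built, so that the same call at worker start-up finds the file already present and skips the network. The names `ensure_model_cached`, `cache_dir`, `filename`, and `fetch` are hypothetical, and the real downloader and cache location would depend on how the Encodec library stores its weights.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable


def ensure_model_cached(
    cache_dir: Path,
    filename: str,
    fetch: Callable[[Path], None],
) -> bool:
    """Make sure cache_dir/filename exists, downloading only if it is missing.

    `fetch` is a hypothetical callable that writes the model weights to the
    path it is given. Returns True if `fetch` was called, False if an
    existing cached file was reused.
    """
    target = cache_dir / filename
    if target.exists():
        # Already baked into the image (or downloaded earlier): nothing to do.
        return False

    cache_dir.mkdir(parents=True, exist_ok=True)

    # Write to a temporary file first and rename it afterwards, so a
    # partially downloaded file is never mistaken for a complete one.
    fd, tmp_name = tempfile.mkstemp(dir=cache_dir, suffix=".part")
    os.close(fd)
    tmp = Path(tmp_name)
    try:
        fetch(tmp)
        os.replace(tmp, target)
    finally:
        # Clean up the temporary file if the download failed part-way.
        if tmp.exists():
            tmp.unlink()
    return True
```

Under these assumptions, a script that calls `ensure_model_cached` would run in a `RUN` step of `job_nac.Dockerfile`, with `cache_dir` pointing at the same directory the workers read from at run time.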

CharlesHolbrow commented 1 year ago

I think this would be good to do, and also pretty easy. I would like this in part because each parallel process may be redundantly downloading its own copy.

There's an example of how to do it in the discogs effnet PR if you want to get started: https://github.com/klay-music/klay-beam/pull/49/files#diff-2fcaa93543c07d3abd7165f39ebbb02c55d9ffcc3dc40c230dc048337c793f76

Otherwise I'll get to it the next time I'm working on this job during the v1.0 work.

My suspicion is that we should just have dedicated jobs for EnCodec and NAC for a few reasons, including that it is also wasteful to put both models in the Docker image.

mxkrn commented 1 year ago

Yea, one of us can tackle this when we get there. I think for now it's not a priority, since there are no upcoming Encodec extraction jobs.

> My suspicion is that we should just have dedicated jobs for EnCodec and NAC for a few reasons, including that it is also wasteful to put both models in the Docker image.

Yea, I agree with this. Another reason is that we're probably not going to be using DAC anytime soon. We're even considering deprecating it for now, because it's difficult to wield for a number of reasons.

CharlesHolbrow commented 11 months ago

Closing in favor of: https://github.com/klay-music/klay-beam-jobs/issues/3