pangeo-forge / C-iTRACE-feedstock

Pangeo Forge feedstock for C-iTRACE.
https://pangeo-forge.org
Apache License 2.0

NetCDF3 eager loading issue overruns worker memory #3

Open cisaacstern opened 1 year ago

cisaacstern commented 1 year ago

@jordanplanders and I have been working on this for a while in https://github.com/pangeo-forge/staged-recipes/pull/176.

This issue remains unresolved and will likely prevent production deployments of this dataset from succeeding.

Aside from the possibility of dramatically increasing worker memory, the best solution I can see is to resolve https://github.com/pangeo-forge/pangeo-forge-recipes/issues/361 via https://github.com/pangeo-forge/pangeo-forge-recipes/pull/383.

jordanplanders commented 1 year ago

@cisaacstern let me know if I have an action item :) This is an increasingly exciting dataset to have available. Is there something I should have considered in terms of target_chunks that would have helped?

cisaacstern commented 1 year ago

@jordanplanders thanks for checking in. I believe this is blocked on our end at the moment. To recap, the solution is either:

  1. https://github.com/pangeo-forge/pangeo-forge-recipes/pull/383 - @rabernat, is anything blocking this?
  2. Dramatically increase worker memory so cloud workers can just open the source data eagerly. However, we do not yet have a manageable way to change worker memory on a per-recipe (or per-feedstock) basis. I started a draft that works towards this (https://github.com/pangeo-forge/pangeo-forge-orchestrator/pull/129), but that PR is at a very early brainstorming stage, and realistically I will not be able to finish it for some time.
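
As a rough illustration of why option 2 needs so much memory, here is a back-of-the-envelope sketch in Python comparing the footprint of one variable opened eagerly against a single chunk of the same variable. The dimension sizes, the dtype, and the chunking are hypothetical placeholders, not the real C-iTRACE layout, and the helper names are made up for this example.

```python
from math import prod

# Bytes per element for a few common NetCDF data types.
DTYPE_BYTES = {"float32": 4, "float64": 8, "int16": 2, "int32": 4}


def eager_bytes(shape, dtype):
    """Memory needed to hold one full variable in RAM (eager open)."""
    return prod(shape.values()) * DTYPE_BYTES[dtype]


def chunk_bytes(shape, chunks, dtype):
    """Memory needed for a single chunk; dims not listed in `chunks` keep full length."""
    sizes = [min(chunks.get(dim, n), n) for dim, n in shape.items()]
    return prod(sizes) * DTYPE_BYTES[dtype]


# Hypothetical dimensions, NOT the real C-iTRACE layout.
shape = {"time": 10_000, "depth": 60, "lat": 116, "lon": 100}
# Hypothetical chunking along time only.
chunks = {"time": 10}

full = eager_bytes(shape, "float32")
one_chunk = chunk_bytes(shape, chunks, "float32")
print(f"eager: {full / 1e9:.1f} GB, per chunk: {one_chunk / 1e6:.1f} MB")
```

If the source file is read eagerly, the worker has to hold something on the order of the first number no matter how small the output chunks are. That is presumably why chunking choices alone do not get around the problem, though the actual numbers for this dataset would need to be checked against the real files.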