NVIDIA-Merlin / HugeCTR

HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training
Apache License 2.0

[Question] How to dump incremental model to kafka in Release 23.12? #438

Open lausannel opened 6 months ago

lausannel commented 6 months ago

I read in the release notes for version 23.11 that:

We are working on deprecating the Embedding Training Cache (ETC). If you try to use that feature, it still works but emits a deprecation warning message. In a near-future release, it will be removed from the API and code level. Please refer to NVIDIA HierarchicalKV as an alternative.

In release 23.12, I found that the embedding training cache API and the model.dump_incremental_model_2kafka() API have been removed. Here are my questions:

  1. How can NVIDIA HierarchicalKV, which is currently a single-GPU key-value store, act as a replacement for the embedding training cache?
  2. How can I dump incremental model updates to a Kafka message queue now that the deprecated model.dump_incremental_model_2kafka() API is gone?

Any insights are appreciated, thanks for your attention!

yingcanw commented 6 months ago

@lausannel Thank you for your feedback. We are gradually deprecating the ETC-related APIs, and HKV will serve as the replacement for ETC. For using SOK for training after integrating HKV, you can refer to the reply in #424. In addition, we are currently developing SOK interfaces that correspond to the ETC ones, such as dump_incremental_model_2kafka, so stay tuned.
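
Until such an interface is available, the general idea behind an incremental dump can be sketched outside HugeCTR. The following is only a minimal illustration, not the removed HugeCTR API and not an SOK or HKV interface. It assumes the embedding rows are already available on the host as plain Python dictionaries. The wire format, the topic name, and the `send` callable are all hypothetical. In practice `send` could wrap the producer of whichever Kafka client you use.

```python
import struct
from typing import Callable, Dict, Tuple

Embedding = Tuple[float, ...]


def changed_rows(prev: Dict[int, Embedding],
                 curr: Dict[int, Embedding]) -> Dict[int, Embedding]:
    """Return rows that are new or differ from the previous snapshot."""
    return {k: v for k, v in curr.items() if prev.get(k) != v}


def encode_row(key: int, vector: Embedding) -> bytes:
    """Hypothetical wire format (little-endian): int64 key, uint32 dim, float32 values."""
    return struct.pack(f"<qI{len(vector)}f", key, len(vector), *vector)


def decode_row(payload: bytes) -> Tuple[int, Embedding]:
    """Inverse of encode_row, as a consumer on the inference side might use it."""
    key, dim = struct.unpack_from("<qI", payload)
    # The header is 8 + 4 = 12 bytes, so the values start at offset 12.
    vector = struct.unpack_from(f"<{dim}f", payload, offset=12)
    return key, vector


def publish_incremental(topic: str,
                        rows: Dict[int, Embedding],
                        send: Callable[[str, bytes], None]) -> int:
    """Send one message per changed row; `send` stands in for a Kafka producer."""
    count = 0
    for key in sorted(rows):
        send(topic, encode_row(key, rows[key]))
        count += 1
    return count


# Two snapshots of a tiny embedding table: row 2 changed, row 3 is new.
prev = {1: (0.5, 0.25), 2: (1.0, 2.0)}
curr = {1: (0.5, 0.25), 2: (1.0, 2.5), 3: (0.0, -1.0)}

sent = []  # in-memory stand-in for the message queue
n = publish_incremental("hypothetical_table_topic",
                        changed_rows(prev, curr),
                        lambda topic, payload: sent.append((topic, payload)))
```

Only rows that differ from the previous snapshot are sent, which is what keeps the update incremental. Comparing full snapshots is the simplest way to find them. A real training loop would more likely track the keys touched since the last dump.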

lausannel commented 6 months ago

@yingcanw Thank you for your update!