Closed sparkling9809 closed 1 year ago
Thanks for your question. Currently, HugeCTR supports reading Parquet data and loading and saving models from/to remote file systems such as HDFS, AWS S3, and GCS. Kafka is supported only in inference, where it is used for online updates of incremental models to HPS. @jershi425, please add your comments.
Yes, as @yingcanw said, we currently don't support reading or streaming data from Kafka. Kafka is used only for model updates. We recommend using our data reader to read Parquet data for training, because of its better performance and convenience.
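Since the data reader works on files, one possible workaround is to drain the stream into small files and train on those. The snippet below is only a rough sketch of that micro-batching idea, not a HugeCTR feature: `micro_batches`, `spill_stream`, and `write_jsonl` are hypothetical helper names, the record stream is a plain Python iterable standing in for a Kafka consumer, and the JSON-lines writer is a placeholder that you would replace with a Parquet writer producing the layout your data reader expects.

```python
import itertools
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Iterator, List


def micro_batches(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group a (possibly endless) record stream into lists of at most batch_size."""
    it = iter(records)
    while True:
        batch = list(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch


def write_jsonl(batch: List[dict], path: Path) -> None:
    # Placeholder writer (assumption): swap in a Parquet writer for real training data.
    with path.open("w") as f:
        for rec in batch:
            f.write(json.dumps(rec) + "\n")


def spill_stream(
    records: Iterable[dict],
    out_dir: Path,
    batch_size: int,
    write_batch: Callable[[List[dict], Path], None] = write_jsonl,
) -> List[Path]:
    """Write each micro-batch to its own file and return the file paths in order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, batch in enumerate(micro_batches(records, batch_size)):
        path = out_dir / f"part_{i:05d}.jsonl"
        write_batch(batch, path)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Fake stream standing in for messages pulled from a Kafka topic.
    fake_stream = ({"label": i % 2, "feature": i} for i in range(10))
    files = spill_stream(fake_stream, Path(tempfile.mkdtemp()), batch_size=4)
    print(len(files), "files written")
```

Each file written this way could then be added to the file list that the training job reads. This gives you near-real-time training with a delay of roughly one micro-batch, not true streaming.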
OK, thanks!
I want to read data from Kafka to implement real-time training, but the data reader in HugeCTR currently supports only files. Is there any way to read training data from Kafka? Thanks.