Qihoo360 / tensornet

Apache License 2.0

Support explicit dataset shard? #47

Open pengyuan opened 3 years ago

pengyuan commented 3 years ago

When TensorFlow reads a dataset from Kafka, how can I make each partition be read by exactly one worker?

Something like Horovod: (a) get the total number of workers and this worker's index (in Horovod, via size() and rank()); (b) then apply TensorFlow's shard() method to the dataset.

zhangys-lucky commented 3 years ago

There is a demo in examples/common/util.py:

dataset = ds_data_files.shard(num_shards=tn.core.shard_num(), index=tn.core.self_shard_id())
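
To make the effect of the shard() call above concrete for the Kafka case, here is a minimal pure-Python sketch of the same selection rule (keep every num_shards-th element, starting at position index) applied to partition ids. It assumes the partitions are numbered 0..N-1 and that the shard count and shard id come from tn.core.shard_num() and tn.core.self_shard_id() as in the line above. The helper name partitions_for_worker is hypothetical and is not part of tensornet or TensorFlow.

```python
from typing import List


def partitions_for_worker(num_partitions: int, num_shards: int, index: int) -> List[int]:
    """Return the partition ids that worker `index` would read (hypothetical helper)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    if not 0 <= index < num_shards:
        raise ValueError("index must be in [0, num_shards)")
    # Same selection rule as tf.data.Dataset.shard: keep every
    # num_shards-th element, starting at position `index`.
    return list(range(num_partitions))[index::num_shards]


if __name__ == "__main__":
    num_partitions = 6  # assumed Kafka partition count
    num_shards = 3      # stand-in for tn.core.shard_num()
    for index in range(num_shards):  # stand-in for tn.core.self_shard_id()
        print(index, partitions_for_worker(num_partitions, num_shards, index))
```

When the number of partitions equals the number of workers, each worker ends up with exactly one partition. How the resulting partition list is handed to the Kafka reader depends on the reader in use and is not shown here.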