YoungSeng / DiffuseStyleGesture

DiffuseStyleGesture: Stylized Audio-Driven Co-Speech Gesture Generation with Diffusion Models (IJCAI 2023) | The DiffuseStyleGesture+ entry to the GENEA Challenge 2023 (ICMI 2023, Reproducibility Award)
MIT License

Question about converting ZEGGS data to LMDB #44

Open zjyzjy361 opened 1 month ago

zjyzjy361 commented 1 month ago

I ran into a problem in the data-processing stage before training. In make_lmdb_gesture_subdataset in zeggs_data_to_lmdb.py, the clips cannot be serialized, and I get the following error:

Traceback (most recent call last):
  File "zeggs_data_to_lmdb.py", line 184, in <module>
    make_lmdb_gesture_dataset(target)
  File "zeggs_data_to_lmdb.py", line 110, in make_lmdb_gesture_dataset
    make_lmdb_gesture_subdataset(train_path, lmdb_name)
  File "zeggs_data_to_lmdb.py", line 95, in make_lmdb_gesture_subdataset
    v = pyarrow.serialize(clips[i]).to_buffer()
  File "pyarrow\serialization.pxi", line 358, in pyarrow.lib.serialize
  File "pyarrow\error.pxi", line 95, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: Did not pass numpy.dtype object

While debugging further, I found that serializing poses, audio_raw, mfcc_raw, and style individually also fails:

poses serialization error: Did not pass numpy.dtype object
audio_raw serialization error: Did not pass numpy.dtype object
mfcc_raw serialization error: Did not pass numpy.dtype object
style_raw serialization error: Did not pass numpy.dtype object

All of them are float64 arrays. How can I solve this? Is it a problem with the raw data, or something else? I downloaded the ZEGGS data from Tsinghua Cloud.

YoungSeng commented 3 weeks ago

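I ran into this problem before as well. Check your numpy version: DiffuseStyleGesture stores its data in LMDB, so the pyarrow and numpy versions conflict easily. You could also try DiffuseStyleGesture+, which stores its data in h5 and does not have this kind of problem.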
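The version check suggested above could be done with a small script like the one below. This is only a minimal sketch: `installed_version` and `version_report` are hypothetical helper names, not part of the repository, and the package list is an assumption based on the libraries named in this thread (numpy, pyarrow, lmdb).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(packages=("numpy", "pyarrow", "lmdb")):
    """Map each package name to its installed version (None when missing).

    The default package list is an assumption taken from this thread,
    not from the repository's own requirements.
    """
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per package so the versions can be pasted into the issue.
    for name, version in version_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Running it in the same environment that runs zeggs_data_to_lmdb.py and comparing the output with the versions given in the repository's setup instructions should show whether the numpy and pyarrow versions differ from the ones the project expects.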