Closed · dddraxxx closed this issue 4 months ago
Thanks for your attention. We have now fixed this bug, so please try again. You can start with a small amount of data to check that training works, which avoids the long wait for tokenization. We will also add a lazy data-processing mode so that the full dataset does not have to be tokenized before training starts.
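For anyone who wants to prototype the lazy mode before it lands in the repo, here is a minimal sketch of on-the-fly tokenization. The class name `LazyTextDataset`, its arguments, and the Hugging Face-style tokenizer call are assumptions for illustration, not the repository's actual implementation.

```python
# Minimal sketch of lazy tokenization: samples are tokenized only when
# requested, instead of tokenizing the whole corpus before training.
# LazyTextDataset and its fields are hypothetical names.
from torch.utils.data import Dataset


class LazyTextDataset(Dataset):
    def __init__(self, texts, tokenizer, max_length=2048):
        self.texts = texts            # raw strings, kept untokenized
        self.tokenizer = tokenizer    # e.g. AutoTokenizer.from_pretrained(...)
        self.max_length = max_length

    def __len__(self):
        return len(self.texts)

    def __getitem__(self, idx):
        # Tokenize a single sample on demand.
        enc = self.tokenizer(
            self.texts[idx],
            truncation=True,
            max_length=self.max_length,
            return_tensors="pt",
        )
        # Drop the batch dimension added by return_tensors="pt".
        return {k: v.squeeze(0) for k, v in enc.items()}
```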
Thanks a lot! Data loading is indeed taking a long time. Thanks for your continuous optimization of the training code!
Hi, thanks for your excellent work! When I try to train using the command
I got the error
`self.projector.model` is shown as follows:

```
Naive_Proj(
  (query_proj): Linear(in_features=512, out_features=6144, bias=True)
  (model): Sequential(
    (0): Linear(in_features=6144, out_features=4096, bias=True)
    (1): GELUActivation()
    (2): Linear(in_features=4096, out_features=4096, bias=True)
  )
  (model_feat): Sequential(
    (0): Linear(in_features=6656, out_features=4096, bias=True)
    (1): GELUActivation()
    (2): Linear(in_features=4096, out_features=4096, bias=True)
  )
  (seperate_embed): Embedding(1, 4096)
)
```

Is there any fix for this bug?
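For reference, a minimal PyTorch sketch that reconstructs the printed `Naive_Proj` structure above, purely to make the layer shapes explicit. The class name `NaiveProjSketch` is hypothetical, the forward pass is omitted, and the real class in the repository may differ.

```python
# Structural sketch only, reconstructed from the printed module repr;
# not the repository's actual Naive_Proj implementation.
import torch.nn as nn


class NaiveProjSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.query_proj = nn.Linear(512, 6144)
        self.model = nn.Sequential(
            nn.Linear(6144, 4096),
            nn.GELU(),  # repr shows GELUActivation(); nn.GELU is the PyTorch equivalent
            nn.Linear(4096, 4096),
        )
        self.model_feat = nn.Sequential(
            nn.Linear(6656, 4096),
            nn.GELU(),
            nn.Linear(4096, 4096),
        )
        self.seperate_embed = nn.Embedding(1, 4096)  # name spelled as in the repr
```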