thu-coai / MiniPLM

MIT License

about chunk_num_per_shard #1

Open snowkcon opened 2 weeks ago

snowkcon commented 2 weeks ago
if len(self._chunks) % self.chunk_num_per_shard == 0:

AttributeError: 'ChunkedDatasetBuilder' object has no attribute 'chunk_num_per_shard'
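
For context, an error like this usually means the attribute is read before `__init__` ever assigns it. The following is only a minimal sketch using a hypothetical stand-in class, not the actual MiniPLM implementation; apart from `chunk_num_per_shard` and `_chunks`, which appear in the traceback, every name here (`add_chunk`, `flushed_shards`, the default value) is an assumption made for illustration.

```python
# Hypothetical stand-in for the builder class; only `chunk_num_per_shard`
# and `_chunks` come from the traceback, everything else is assumed.
class ChunkedDatasetBuilder:
    def __init__(self, chunk_num_per_shard=2):
        # Assigning the attribute here is what avoids the AttributeError.
        self.chunk_num_per_shard = chunk_num_per_shard
        self._chunks = []
        self.flushed_shards = []  # assumed in-memory stand-in for shard files

    def add_chunk(self, chunk):
        self._chunks.append(chunk)
        # Same check as in the traceback: flush once a shard is full.
        if len(self._chunks) % self.chunk_num_per_shard == 0:
            self.flushed_shards.append(list(self._chunks))
            self._chunks = []


builder = ChunkedDatasetBuilder(chunk_num_per_shard=2)
for i in range(5):
    builder.add_chunk(i)
```

With two chunks per shard and five chunks added, two shards are flushed and one chunk remains buffered.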

t1101675 commented 2 weeks ago

Fixed.

snowkcon commented 1 week ago

about grouped_infer

scripts/miniplm/difference_sampling/1.8B.sh

t1101675 commented 1 week ago

It seems the argument was removed by mistake during code cleaning. The bug is fixed. Thanks for pointing it out!

snowkcon commented 1 week ago
  1. construct_pretrain_data.py: no method is called.
  2. README: the section on construct_pretrain_data.py does not mention that a ratio must be entered.

t1101675 commented 1 week ago

Fixed.

snowkcon commented 1 week ago

about readme

Vanilla KD
bash scripts/vanilla_kd/qwen/200M.sh /PATH/TO/MiniPLM
bash scripts/vanilla_kd/qwen/500M.sh /PATH/TO/MiniPLM
bash scripts/vanilla_kd/qwen/1.2B.sh /PATH/TO/MiniPLM

SeqKD
bash scripts/seqkd/qwen/200M.sh /PATH/TO/MiniPLM
bash scripts/seqkd/qwen/500M.sh /PATH/TO/MiniPLM
bash scripts/seqkd/qwen/1.2B.sh /PATH/TO/MiniPLM

t1101675 commented 1 week ago

Fixed.