NVIDIA / NeMo-Aligner

Scalable toolkit for efficient model alignment
Apache License 2.0

add packed dataset #181

Open gshennvm opened 1 month ago

gshennvm commented 1 month ago

In Aligner, we should pack the dataset along the sequence dimension when training. This would benefit context parallelism (CP) and training performance.
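For readers unfamiliar with the idea, here is a minimal sketch of packing along the sequence dimension. It is not NeMo-Aligner code: the function name `pack_sequences`, the greedy first-fit strategy, and the `pad_id` default are all assumptions chosen for illustration. Several short tokenized examples are concatenated into one row of at most `max_len` tokens, and the cumulative boundaries are kept so the original examples can still be told apart.

```python
from typing import List, Tuple


def pack_sequences(
    sequences: List[List[int]], max_len: int, pad_id: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Hypothetical helper: greedy first-fit packing of token sequences.

    Returns one (tokens, boundaries) pair per packed row, where `tokens`
    has exactly `max_len` entries (padded with `pad_id`) and `boundaries`
    holds the cumulative start/end offsets of each original sequence.
    """
    bins: List[List[List[int]]] = []
    for seq in sequences:
        if len(seq) > max_len:
            raise ValueError(f"sequence of length {len(seq)} exceeds max_len={max_len}")
        # Put the sequence in the first row that still has room for it.
        for row in bins:
            if sum(len(s) for s in row) + len(seq) <= max_len:
                row.append(seq)
                break
        else:
            # No existing row fits, so start a new one.
            bins.append([seq])

    packed: List[Tuple[List[int], List[int]]] = []
    for row in bins:
        tokens = [tok for s in row for tok in s]
        boundaries = [0]
        for s in row:
            boundaries.append(boundaries[-1] + len(s))
        # Pad the tail so every packed row has the same length.
        tokens += [pad_id] * (max_len - len(tokens))
        packed.append((tokens, boundaries))
    return packed


if __name__ == "__main__":
    rows = pack_sequences([[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]], max_len=5)
    for tokens, boundaries in rows:
        print(tokens, boundaries)
```

The boundary offsets are what a training loop would typically use to build an attention mask or variable-length attention inputs so that packed examples do not attend to each other. How that would be wired into Aligner is left open here.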

lifan-yuan commented 3 weeks ago

@gshennvm @aaronp24 @odelalleau @tmfs10 @lukeyeager Are there any plans to implement this feature? 👀