Closed XinDongol closed 1 month ago
Hello. Yes, you can still build a model without PipelineBlock, but you'll need to make some modifications, because the current version of Nanotron's Trainer isn't designed for this.
For example, you could bypass the build_model function [link], and directly initialize the model weights.
When building a customized model, do we need to make sure all blocks are PipelineBlock? I was trying to build a model with only data parallelism and tensor parallelism, but got an error in this line:
thresholds = [block_cumulative_costs[-1] * ((rank + 1) / pp_size) for rank in range(pp_size)]
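For anyone hitting the same line, here is a minimal standalone sketch of what that threshold computation does and one plausible way it can fail. It assumes (this is not confirmed in the thread) that block_cumulative_costs is a running sum of per-block costs collected only from PipelineBlock modules, so a model with no PipelineBlock would leave it empty. The function name pipeline_thresholds and the cost values are hypothetical, used only for illustration.

```python
from itertools import accumulate


def pipeline_thresholds(block_costs, pp_size):
    """Split the total cost into pp_size equal cumulative targets.

    block_costs is a hypothetical list of per-block compute costs;
    in this sketch it stands in for costs gathered from PipelineBlock modules.
    """
    # Running sum of block costs, e.g. [1, 1, 2, 4] -> [1, 2, 4, 8]
    block_cumulative_costs = list(accumulate(block_costs))
    # Same expression as the line quoted in the question
    return [block_cumulative_costs[-1] * ((rank + 1) / pp_size) for rank in range(pp_size)]


# Four blocks split across two pipeline stages
print(pipeline_thresholds([1, 1, 2, 4], pp_size=2))  # [4.0, 8.0]

# No blocks at all (assumed to be the case without any PipelineBlock):
# indexing [-1] on an empty list raises IndexError, even with pp_size=1.
try:
    pipeline_thresholds([], pp_size=1)
except IndexError as exc:
    print("IndexError:", exc)
```

If this assumption holds, the failure would come from the empty cost list rather than from the pp_size value itself, which would be consistent with the suggestion to bypass build_model when no block is a PipelineBlock.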