microsoft / torchscale

Foundation Architecture for (M)LLMs
https://aka.ms/GeneralAI
MIT License

fix a bug that overrides the default constructed output_projection #9

Closed by MatthewChang 1 year ago

MatthewChang commented 1 year ago

Fixes a bug that overrides the default constructed output_projection when None is passed in.

L228-235

 if (
     output_projection is None
     and not args.no_output_layer
     and args.vocab_size > 0
 ):
     self.output_projection = self.build_output_projection(args)
 else:
     self.output_projection = output_projection

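This block seems to already handle the None case properly: it builds the default projection when output_projection is None, the output layer is enabled, and vocab_size is positive.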

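This PR removes a line that unnecessarily overrides the assignment above, so the default-constructed output_projection is no longer replaced by the None that was passed in.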
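To illustrate the effect, here is a minimal sketch with stand-in classes. It is not the torchscale code: DecoderBefore, DecoderAfter, and the tuple returned by build_output_projection are hypothetical, and the exact form of the removed line is an assumption (an unconditional reassignment placed after the block above).

```python
from types import SimpleNamespace


def build_output_projection(args):
    # Hypothetical placeholder for a real projection layer.
    return ("projection", args.vocab_size)


class DecoderBefore:
    """Stand-in for the behavior before this PR (assumed, not the real class)."""

    def __init__(self, args, output_projection=None):
        if (
            output_projection is None
            and not args.no_output_layer
            and args.vocab_size > 0
        ):
            self.output_projection = build_output_projection(args)
        else:
            self.output_projection = output_projection
        # Assumed form of the removed line: it discards the default built above.
        self.output_projection = output_projection


class DecoderAfter:
    """Stand-in for the behavior after this PR: the extra line is gone."""

    def __init__(self, args, output_projection=None):
        if (
            output_projection is None
            and not args.no_output_layer
            and args.vocab_size > 0
        ):
            self.output_projection = build_output_projection(args)
        else:
            self.output_projection = output_projection


args = SimpleNamespace(no_output_layer=False, vocab_size=100)
print(DecoderBefore(args).output_projection)  # default is lost
print(DecoderAfter(args).output_projection)   # default is kept
```

Under these assumptions, only the case where None is passed in changes; an explicitly supplied output_projection is stored the same way before and after.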

shumingma commented 1 year ago

Good catch! Thanks!