microsoft / TransformerCompression

For releasing code related to compression methods for transformers, accompanying our publications
MIT License
354 stars 31 forks

Add QuaRot (RTN) [WIP] #130

Closed nailimixaM closed 4 months ago

nailimixaM commented 5 months ago

Questions:

Reviewers, please ignore the following files, as they are exact duplicates of files on the slicegpt side of the codebase:

pashminacameron commented 5 months ago

Questions:

  • Adding quarot as a new package under src/: should we modify the toml so that if someone wants only one of SliceGPT or QuaRot we don't install extra dependencies only used by the other? e.g. fast_hadamard_transform is only used by QuaRot
  • Should we be moving non-SliceGPT-specific things like model downloading from HF, distributing models over GPUs, dataloading, etc to a common dir?
  • Bump up the version number in the toml file?

Yes to all. It will be a major version bump, so we can make breaking changes. It might also be a good opportunity to move to package versions for transformers and lm-eval rather than using specific git commits.
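
For concreteness, here is a rough sketch of what the toml change could look like, using standard `[project.optional-dependencies]` extras. It is not the repo's actual config. The extra names `slicegpt` and `quarot` are hypothetical, and it assumes `fast_hadamard_transform` can be installed under the distribution name shown (it may still need a git URL).

```toml
# Hypothetical pyproject.toml fragment, for illustration only.
[project]
dependencies = [
    # Shared by both methods. Released packages rather than git commits;
    # add a version specifier (e.g. "==" a released version) when pinning.
    "transformers",
    "lm-eval",
]

[project.optional-dependencies]
# Extra names are assumptions, one per method.
slicegpt = []
# Assumes the package resolves under this name; otherwise use a direct URL.
quarot = ["fast-hadamard-transform"]
```

With something like this, `pip install -e ".[quarot]"` would pull in the QuaRot-only dependency, while a plain `pip install -e .` would skip it.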