Closed: fabianlim closed this 3 months ago
@fabianlim I would also suggest dropping ParameterizedEmbedding
and ParameterizedLinear
and using the linear and embedding from torch directly
They are just for an experimental project I was working on.
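The swap suggested above is mechanical if the wrappers add no behavior beyond plain layers. A minimal sketch (assuming the custom classes are drop-in compatible with the stock torch modules, and using illustrative sizes):

```python
import torch
import torch.nn as nn

# Instead of ParameterizedEmbedding / ParameterizedLinear, use the
# stock torch modules directly. Sizes here are illustrative only.
embedding = nn.Embedding(num_embeddings=1000, embedding_dim=128)
linear = nn.Linear(in_features=128, out_features=256)

tokens = torch.randint(0, 1000, (2, 16))  # (batch, seq_len)
hidden = embedding(tokens)                # (2, 16, 128)
out = linear(hidden)                      # (2, 16, 256)
```

Any custom initialization the wrappers performed would need to be reapplied (e.g. via `nn.init` on `linear.weight`) after the swap.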
@aldopareja this is more or less ok, but missing the notices. what do we want to put in the header of every file?
like this?
# this code has been extracted from https://github.com/ibm-granite/dolomite-engine
We should also add the same publishing CI that we use elsewhere in instructlab so that it's easy to get stuff published.
This one instructlab/training#31
Sorry not that one, this one: https://github.com/instructlab/training/pull/42
Never mind, I created a PR for publishing here, so you can ignore the above comments: https://github.com/instructlab/GPTDolomite/pull/2
wow!!! 4k lines of code already :)
Yikes!! all of my comments were ignored 🤣
@mayank31398 I thought you only gave these 2 comments
Was there anything else?
This is the initial extraction from the dolomite-engine repo.

Extracted models:
- hf_models/models/gpt_dolomite
- hf_models/models/moe_dolomite

Conversion from HF supported:
- hf_models/model_conversion/bigcode
- hf_models/model_conversion/llama
- hf_models/model_conversion/mixtral

TODO:
- modeling_utils/normalization/rmsnorm/torchtitan.py
- modeling_utils/normalization/rmsnorm/apex.py
- modeling_utils/normalization/layernorm/apex.py
- modeling_utils/normalization/layernorm/apex_persistent.py
- modeling_utils/embedding/ParameterizedEmbedding
- modeling_utils/linear/ParameterizedLinear