rezzie-rich opened 2 months ago
Llamafying it shouldn't cause license issues, since it's just a rearrangement of modules. I'll try, but it's best to llamafy it for now.
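For readers unfamiliar with the term: "llamafying" generally means remapping a checkpoint's parameter names onto the Llama module layout so that Llama-only tooling can load it. A minimal sketch of the renaming step (the key names here are assumptions based on the two architectures, not a verified mapping; the fused wqkv tensor would additionally need to be split, which is not shown):

```python
# Hypothetical InternLM2 -> Llama state-dict key renaming.
# These name pairs are illustrative assumptions, not confirmed mappings.
KEY_MAP = {
    "attention.wqkv": None,  # fused q/k/v: must be split by head count, handled separately
    "attention.wo": "self_attn.o_proj",
    "feed_forward.w1": "mlp.gate_proj",
    "feed_forward.w3": "mlp.up_proj",
    "feed_forward.w2": "mlp.down_proj",
    "attention_norm": "input_layernorm",
    "ffn_norm": "post_attention_layernorm",
}

def llamafy_key(key: str) -> str:
    """Rewrite a single InternLM2-style parameter name into Llama naming.
    Keys with no mapping (embeddings, fused qkv) pass through unchanged."""
    for old, new in KEY_MAP.items():
        if new is not None and old in key:
            return key.replace(old, new)
    return key
```

Since only names change (no weight values are touched), this supports the "re-arrangement of modules" argument above.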
Thank you, looking forward to it. If possible, the 1M-context version :D; if not, 200k will work too.
Their benchmark shows it performs best up to a 200k context window before losing some quality.
Would it be too much to ask you to request a commercial-usage license for InternLM from its creators for the Unsloth version? They offer it for free upon request. If you obtain it, it becomes easier for anyone using the Unsloth version of InternLM, since they won't need to request it again.
Apologies for the delay - hmm, I think it's the engineer themselves (i.e. yourself) who has to request it. We can request it for our own use, but we're unsure about distributing it through ourselves.
Maybe that can be confirmed during the request, since llamafying it makes it a different model architecturally.
I have llamafied InternLM2.5-7B and tried to open it in Unsloth. I get:
/usr/local/lib/python3.10/dist-packages/unsloth/models/llama.py in LlamaAttention__init__(self, config, layer_idx)
ValueError: Unknown RoPE scaling type dynamic
The Open LLM Leaderboard also seems to be having trouble with its dynamic rope_scaling: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard/discussions/862
In the other closed issue, you mentioned that RoPE scaling can be disabled in order to finetune the model. I will try that.
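One way to do that (a sketch, assuming the standard Hugging Face config.json layout) is to null out the rope_scaling entry before loading, so loaders that reject the "dynamic" type fall back to plain RoPE, at the cost of the extended context:

```python
import json

def disable_rope_scaling(config_path: str) -> dict:
    """Set rope_scaling to null in an HF-style config.json so that
    loaders rejecting the 'dynamic' type can load the model.
    Note: this shortens the usable context back to the trained length."""
    with open(config_path) as f:
        cfg = json.load(f)
    cfg["rope_scaling"] = None
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)
    return cfg
```

This is a workaround, not a fix: long-context quality past the trained length will degrade without the scaling.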
Wait, what's "dynamic" RoPE scaling? The only accepted types are linear RoPE scaling, NTK, YaRN, the Llama-3 type, etc.
I actually have no idea; I'll probably need to read the InternLM remote code: https://huggingface.co/internlm/internlm2_5-7b/blob/main/modeling_internlm2.py
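For what it's worth, in HF-style modeling code "dynamic" usually refers to dynamic NTK scaling, where the rotary base is rescaled as a function of the current sequence length rather than by a fixed factor. An illustrative sketch of that scheme (the exact formula should be checked against modeling_internlm2.py; this is an assumption based on similar implementations):

```python
def dynamic_ntk_base(base: float, dim: int, seq_len: int,
                     max_pos: int, factor: float) -> float:
    """Dynamic-NTK RoPE sketch: leave the base untouched within the
    trained context, and grow it once seq_len exceeds max_pos, so the
    rotary frequencies stretch with the actual sequence length."""
    if seq_len <= max_pos:
        return base  # within trained context: vanilla RoPE
    scale = (factor * seq_len / max_pos) - (factor - 1)
    return base * scale ** (dim / (dim - 2))
```

If that's what the remote code implements, it's a runtime behavior rather than a fixed config transform, which would explain why loaders expecting a static scaling type reject it.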
Good news: if Unsloth makes an InternLM version (both bf16 and int4), it can be released under the Apache-2.0 license, since the original model is under Apache-2.0 for research purposes. Anyone using the Unsloth version would then only be bound by that model's license, since they aren't using the original model.
It would be highly appreciated if Unsloth released a 200k-context-window version of the model. It has the best needle-in-a-haystack benchmark score of any model I've seen.
The original model is not Apache-2.0, even for research purposes; only the inference code is. However, models are probably not copyrightable in the US. The best way to get it licensed under Apache-2.0 is to ask for a license.
It's from the model card: "Open Source License: The code is licensed under Apache-2.0, while model weights are fully open for academic research and also allow free commercial usage. To apply for a commercial license, please fill in the application form (English/Chinese). For other questions or collaborations, please contact internlm@pjlab.org.cn."
That explicitly says that only the code is under Apache-2.0. The model weights are available under an unspecified license which prohibits commercial use, and you can get a commercial license by applying with the application form.
It clearly states that the license requirement is only for commercial use. Otherwise, it's open under Apache.
It clearly says that only the code is licensed under Apache-2.0. Anyways, it would be best to contact them, as they have not revealed the details of their public license.
Can we please get official support for InternLM-2.5?
I have seen a closed issue regarding that (#734); however, the model mentioned there might be broken, as it fails to load, for instance.
It would be great to get an official version from you guys since the model has a lot of potential due to its size and context window.
Additional question: does llamafying a model impose any licensing restrictions from Llama? If so, it would be hugely appreciated if the supported InternLM were not restricted by any Llama licensing agreement.