-
Hi,
I really like your project as it provides an easy-to-use approach. I have been thinking that since the new Llama 3.1 is multilingual, could this approach also be used in that way? As we are on…
-
Hi!
In the research paper, the authors tackle many different problems using the same base architecture, which is one of the main strengths of the article. Unfortunately, the current version of the code on…
-
Hi all, I'm trying to train SBERT with SetFit to classify whether two sentences are duplicates. How do I make "column_mappings" accept two sentences instead of one?
Below is the code I…
-
As another step toward a strongly-typed language with a functional flavor, wenyan-lang now supports ML style [static typechecking](https://en.wikipedia.org/wiki/Type_system#Static_type_checking) and […
-
Currently, we use SentencePiece in Tokenizer for our models that contain ZH/JA, in which no space serves as a natural word boundary.
The SentencePiece model is applied after Tokenizer's `none` mode.
`node…
-
Hi
Since I don't have access to a GPU, I can't run your code, but there is another implementation on GitHub that builds your model with the Keras library. Can you confirm the following code and c…
-
Hi, any thoughts on the new sparsity features of the NVIDIA Ampere GPUs? It looks like they could give a big speed improvement if applicable:
[A100 whitepaper](https://www.nvidia.com/content/dam/en-zz/…
-
- [ ] [The Scaling Hypothesis · Gwern.net](https://gwern.net/scaling-hypothesis)
# The Scaling Hypothesis · Gwern.net
**DESCRIPTION:** "GPT-3, announced by OpenAI in May 2020, is the largest neura…
-
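Following the TIR prompt, I ran experiments on the 1.5B and 7B Qwen2.5-Math models, but the resulting metrics are worse than with CoT. I suspect my implementation is missing some steps; could you explain the implementation in more detail?
I implemented TIR based on the prompt below: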
```
# TIR
messages = [
{"role": "system", "content": "Please integrate natural languag…