-
I can obtain the episode reward mean from the training results, but it fluctuates a lot, which makes it difficult to judge when to stop the training iterations, so I would like to use the evaluation results instead.
…
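A minimal sketch of one way to do this, assuming the question refers to Ray RLlib (the phrase "episode reward mean from the train result" suggests it, but the library is not named): enable periodic evaluation in the config and use the smoother evaluation metric, rather than the noisy training reward, as the stopping signal. The environment, algorithm, and reward threshold below are placeholders.

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Run evaluation episodes every few training iterations so a smoother metric
# is reported alongside the noisy per-iteration training reward.
config = (
    PPOConfig()
    .environment("CartPole-v1")
    .evaluation(
        evaluation_interval=5,              # evaluate every 5 train() calls
        evaluation_duration=20,             # average over 20 episodes per round
        evaluation_duration_unit="episodes",
    )
)
algo = config.build()

for _ in range(200):
    result = algo.train()
    # Evaluation results only appear on iterations where evaluation ran;
    # the exact result-dict layout varies slightly across RLlib versions.
    eval_reward = result.get("evaluation", {}).get("episode_reward_mean")
    if eval_reward is not None and eval_reward >= 450.0:
        break

algo.stop()
```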
-
I was doing some code completion evaluation with [codefuseEval](https://github.com/codefuse-ai/codefuse-evaluation/tree/master/codefuseEval) on the **Qwen2.5-Coder base model**. When I ran a Java ev…
-
To get ```train.py``` to run fine with the ```--evaluate``` flag, I had to modify the ```load()``` method of ```SaveableRNNLM()``` to return ```vocab``` as well as ```lm```. Otherwise running the eva…
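A minimal sketch of the kind of change described, assuming the model is saved as a pickled dict holding both pieces; the on-disk format, the key names, and `load()` being a classmethod are all assumptions, not taken from the repository:

```python
import pickle

class SaveableRNNLM:
    # ... rest of the class unchanged ...

    @classmethod
    def load(cls, path):
        # Assumed layout: a pickled dict with the trained language model and
        # its vocabulary stored side by side.
        with open(path, "rb") as f:
            state = pickle.load(f)
        lm = state["lm"]
        vocab = state["vocab"]
        return lm, vocab  # previously only `lm` was returned
```

The `--evaluate` branch of `train.py` can then unpack both values, e.g. `lm, vocab = SaveableRNNLM.load(model_path)` (where `model_path` is a placeholder name).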
-
Full JSON schema: {'$defs': {'ChangeType': {'enum': ['addition', 'modification', 'deletion'], 'title': 'ChangeType', 'type': 'string'}, 'CodeSpan': {'properties': {'file_path': {'description': 'The fi…
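For readability, here is a sketch of Pydantic (v2) models that would produce a schema shaped like this fragment; the `CodeChange` parent model is hypothetical, and the truncated `file_path` description is left out rather than guessed:

```python
from enum import Enum
from pydantic import BaseModel

class ChangeType(str, Enum):
    addition = "addition"
    modification = "modification"
    deletion = "deletion"

class CodeSpan(BaseModel):
    # The fragment shows a `file_path` property; its description string is
    # truncated in the original, so it is omitted here.
    file_path: str

class CodeChange(BaseModel):
    # Hypothetical parent model, only here so that both definitions land in '$defs'.
    change_type: ChangeType
    span: CodeSpan

print(CodeChange.model_json_schema())  # emits '$defs' containing ChangeType and CodeSpan
```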
-
### Describe the bug
Hello authors,
I followed the implementation instructions in the Jupyter notebook exactly; however, when it came to testing, I got this bug.
```
----------------------…
-
- **Is your feature request related to a problem? Please describe:**
The current face expression recommendation system uses MobileNet, and there is a need to evaluate a custom CNN model built from s…
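A hedged sketch of what comparing the two models could look like, assuming a Keras/TensorFlow setup (the issue does not name the framework); `mobilenet_baseline`, `custom_cnn`, and `test_ds` below are placeholder names for already-compiled models and a shared test split, not names from the project:

```python
import tensorflow as tf

def evaluate_model(model: tf.keras.Model, test_ds: tf.data.Dataset) -> dict:
    """Return the named metrics of an already-compiled model on the test split."""
    return model.evaluate(test_ds, verbose=0, return_dict=True)

# Placeholder usage: both models are compiled with the same loss/metrics and
# evaluated on the same tf.data test split.
# results = {name: evaluate_model(m, test_ds)
#            for name, m in {"mobilenet": mobilenet_baseline,
#                            "custom_cnn": custom_cnn}.items()}
```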
-
Hello, I only want to reproduce the evaluation results on the dev set from the paper. When evaluating English, is it necessary to embed the entire 15-million corpus first? This process…
-
Can you provide the evaluation code? When I tested it on the MMBenchmark with a 1B model, the performance was quite low, only around 19.