-
expected JSON response:
```json
{
"url": "https://api.github.com/repos/octocat/Hello-World/pulls/1347",
"id": 1,
"node_id": "MDExOlB1bGxSZXF1ZXN0MQ==",
"html_url": "https://github.com/…
-
We should be able to remove repositories via the `DELETE /repos/{owner}/{repo}` endpoint. The responses we expect:
in case of successful repo deletion:
```text
Status: 204
```
in case o…
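A minimal sketch of building that delete call (not the project's actual client code; the helper name and bearer-token header are assumptions for illustration):

```python
import urllib.request

# Hedged sketch: build the DELETE /repos/{owner}/{repo} request described above.
# Helper name and auth handling are illustrative assumptions.
def build_delete_repo_request(owner: str, repo: str, token: str) -> urllib.request.Request:
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}",
        method="DELETE",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )

req = build_delete_repo_request("octocat", "Hello-World", "<token>")
# Sending req and receiving HTTP 204 with an empty body means the repo was deleted.
```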
-
expected JSON body we should process:
```json
{
"title":"Amazing new feature",
"body":"Please pull these awesome changes in!",
"head":"octocat:new-feature",
"base":"master"
}
```
…
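A hedged sketch of how a client might POST that JSON body to the `POST /repos/{owner}/{repo}/pulls` endpoint (the helper name and token handling are assumptions, not this project's code):

```python
import json
import urllib.request

# Hedged sketch: build a "create pull request" call with the JSON fields above.
# Helper name and auth handling are illustrative assumptions.
def build_create_pr_request(owner, repo, token, title, body, head, base):
    payload = json.dumps({
        "title": title,
        "body": body,
        "head": head,
        "base": base,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/pulls",
        data=payload,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

req = build_create_pr_request(
    "octocat", "Hello-World", "<token>",
    "Amazing new feature", "Please pull these awesome changes in!",
    "octocat:new-feature", "master",
)
```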
-
This is a really great local LLM backend that works on a lot of platforms
(including Intel Macs) and is basically a one-click install.
**Main site:** https://ollama.ai/
**API docs:** https://githu…
-
When I typed in the following command,
`torchrun --nproc_per_node=4 run_casp.py scripts/configs/train_casp_moco.yaml`
I found that the training was progressing too slowly, taking over 20 hours.…
-
@l3r8yJ WDYT?
-
# URL
- https://arxiv.org/abs/2403.05313
# Affiliations
- Zihao Wang, N/A
- Anji Liu, N/A
- Haowei Lin, N/A
- Jiaqi Li, N/A
- Xiaojian Ma, N/A
- Yitao Liang, N/A
# Abstract
- We explore …
-
Base Llama models use 10000 for RoPE theta; Code Llama models use 1000000 to handle a larger context.
With the current hardcoded value, the code models tend to close extra parentheses, whereas…
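For reference, a minimal sketch of how the RoPE base affects the rotation frequencies (illustrative code, not this repo's implementation): a larger theta slows the high-index rotations, so positions stay distinguishable over longer contexts.

```python
# Hedged sketch: RoPE inverse frequencies for a given theta (the "rope base").
# Base Llama uses theta=10000; Code Llama uses theta=1000000.
def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    # one rotation frequency per pair of dimensions
    return [theta ** (-2 * i / head_dim) for i in range(head_dim // 2)]

base_freqs = rope_inv_freq(128, 10_000.0)
code_freqs = rope_inv_freq(128, 1_000_000.0)
# With theta=1000000 the highest-index frequency is much smaller,
# i.e. that dimension pair rotates far more slowly per position.
```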
-
I don't know what your plan is with this project, so I just wanted to ask whether you want to grow it and move toward
- more available models (7B, 13B), CodeLlama, ...
- support for quantized models
- impro…
-
Hello,
Have you checked what happens when `n_heads != n_kv_heads`? How does this affect the RoPE rotation and the MHA, which now becomes GQA?
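For context, a minimal sketch of the usual GQA arrangement (illustrative names, not necessarily this repo's code): when `n_kv_heads < n_heads`, each K/V head is shared by `n_heads // n_kv_heads` query heads, typically by repeating the K/V heads before attention; RoPE is still applied per head to Q and K exactly as in MHA.

```python
# Hedged sketch of the common "repeat_kv" trick used for grouped-query attention.
# kv is a list of per-head values standing in for K or V tensors.
def repeat_kv(kv: list, n_rep: int) -> list:
    out = []
    for head in kv:
        out.extend([head] * n_rep)  # each KV head serves n_rep query heads
    return out

n_heads, n_kv_heads = 32, 8
n_rep = n_heads // n_kv_heads            # 4 query heads share each KV head
kv_heads = [f"kv{i}" for i in range(n_kv_heads)]
expanded = repeat_kv(kv_heads, n_rep)    # now aligned with the 32 query heads
```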