-
- [ ] [DeepSeek-V2: A Strong, Economical, and Efficient MoE LLM of 236B total parameters](https://github.com/deepseek-ai/DeepSeek-V2)
# DeepSeek-V2: A Strong, Economical, and Efficient MoE LLM of 2…
-
I am using the `bfastlite()` function to run a time-series analysis. From the author's [paper](https://www.mdpi.com/2072-4292/13/16/3308) (table 2), I quote:
> Needs parameter tuning to optimise …
-
During the hackathon I just picked a model without knowing much about what an ideal model fit would be. Reconsider the choice and change it if needed.
-
[Cross-document Coreference Resolution over Predicted Mentions](https://aclanthology.org/2021.findings-acl.453.pdf)
=========
## Contribution summary
- Cattan et al. proposed the first end-to-end m…
-
https://arxiv.org/abs/1907.11093
-
I adapted the scripts to use the newly released, open-source Cerebras model.
So far I have trained a LoRA with Cerebras 1.3B on Alpaca cleaned (1 h on dual 3090s).
Base https://huggingface.co/cerebras/Ce…
-
### System Info
```shell
Hi Team,
I need your help converting the OWL-ViT model (OwlViTForObjectDetection) into an ONNX file.
############################################################################…
-
Hello, nice work.
I would like to ask when the trained model will be open-sourced, and how I can call it with input text to predict the stock price.
Thank you.
-
**Is this a BUG REPORT or FEATURE REQUEST?**:
> Uncomment only one, leave it on its own line:
>
> /kind bug
> /kind feature
**What happened**:
Currently we store the model in one layer.
*…
-