-
I could not find such a cached method using `past_key_values` in SAT. Is it possible to add one?
Thanks.
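
For context, the kind of caching being asked about can be sketched as below. This is a minimal, framework-free illustration of key/value caching in autoregressive decoding, not SAT's or any library's actual API: `KVCache`, `decode_step`, `to_key`, and `to_value` are hypothetical names, and the projections are toy stand-ins for learned weights. It shows that appending one key/value pair per step gives the same output as recomputing all keys and values from scratch.

```python
import math


def attend(query, keys, values):
    """Scaled dot-product attention of one query over all stored keys/values."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total for j in range(dim)]


def to_key(x):
    """Toy key projection (assumption: stands in for a learned matrix)."""
    return [0.5 * xi for xi in x]


def to_value(x):
    """Toy value projection (assumption: stands in for a learned matrix)."""
    return [2.0 * xi for xi in x]


def last_output_uncached(inputs):
    """Recompute every key and value, then attend from the last position."""
    keys = [to_key(x) for x in inputs]
    values = [to_value(x) for x in inputs]
    return attend(inputs[-1], keys, values)


class KVCache:
    """Holds the keys and values of all positions decoded so far."""

    def __init__(self):
        self.keys = []
        self.values = []


def decode_step(x, cache):
    """Project only the new position, append it to the cache, then attend."""
    cache.keys.append(to_key(x))
    cache.values.append(to_value(x))
    return attend(x, cache.keys, cache.values)
```

With the cache, each step projects one new position instead of the whole prefix, which is the saving that `past_key_values` provides in libraries that support it.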
-
When I increase max_frames from 16 to 32, I see a loss of temporal dynamics in the video after 16 frames.
The config I was using is:
```
TASK_TYPE: inference_i2vgen_entrance
use_fp16: True
guid…
```
-
### Feature request
Build a generic script/pipeline that takes as input:
- Model name
- One or more recordings
Then the pipeline should:
- Build prompts from the events in each recording.
…
-
Hi @ex3ndr, I checked out your code here: https://github.com/ex3ndr/supervoice-gpt/blob/master/train_tokenizer.py
I saw you have tried two training runs, one with text and the other with phonemes. Any spe…
-
Currently the Transformer is not really implemented as it should be. We should revisit it and implement it as in the original Transformer paper, including always training to predict the next sample (like l…
-
Milan basically implemented this StochasticStir branch at my request, in order for me to explore in more detail the ideas in [Barnes and Hartmann (2011)](https://doi.org/10.1175/JAS-D-11-039.1). In pa…
-
# Sunday
## Ch 7: GCP for Marketing
Pam Castricone (Head of Data Science) & Tyler Blatt (Head of Integrations)
- Pareto Negative Binomial Distribution model
- Non-contractual transactions wh…
-
- Static shapes
- Padding
- Efficiency/compilation time tradeoff
- Autoregressive text generation loop
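
The topics above fit together roughly as follows. This is a minimal sketch under stated assumptions, not a specific framework's API: `bucket_length`, `pad_to_bucket`, `generate`, and `toy_step` are hypothetical names, and `toy_step` stands in for a compiled model call. Padding each input up to one of a few fixed bucket sizes keeps the set of distinct input shapes small, so a compiler that specializes on shape would only need to compile once per bucket, at the cost of some wasted computation on padding.

```python
def bucket_length(n, buckets=(8, 16, 32, 64)):
    """Return the smallest bucket size that can hold n tokens."""
    for size in buckets:
        if n <= size:
            return size
    raise ValueError(f"sequence of length {n} exceeds the largest bucket")


def pad_to_bucket(tokens, pad_id=0, buckets=(8, 16, 32, 64)):
    """Right-pad tokens to a bucket size; the mask marks real tokens with 1."""
    target = bucket_length(len(tokens), buckets)
    n_pad = target - len(tokens)
    return tokens + [pad_id] * n_pad, [1] * len(tokens) + [0] * n_pad


def toy_step(padded, mask):
    """Toy stand-in for a model call: next token = sum of real tokens mod 10."""
    return sum(t for t, m in zip(padded, mask) if m) % 10


def generate(step_fn, prompt, max_new_tokens, pad_id=0, buckets=(8, 16, 32, 64)):
    """Autoregressive loop that only ever feeds bucket-sized inputs to step_fn."""
    tokens = list(prompt)
    shapes_seen = set()  # each distinct shape would trigger one compilation
    for _ in range(max_new_tokens):
        padded, mask = pad_to_bucket(tokens, pad_id, buckets)
        shapes_seen.add(len(padded))
        tokens.append(step_fn(padded, mask))
    return tokens, shapes_seen
```

Fewer, larger buckets mean fewer compilations but more padding per step; more, finer buckets reverse that tradeoff.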
-
# 🌟 New Model: BLOOM
## Model description
The BLOOM model has been proposed with its various versions through the [BigScience Workshop](https://bigscience.huggingface.co/). BigScience is inspired …
-
Hi, I have a question about your code here:
https://github.com/mosaicml/composer/blob/a7cad7c221ce8ad9697bde50db0b3f37f8b8025e/composer/datasets/in_context_learning_evaluation.py#L655
Why do you…