-
Near line 1129 of modeling.py:
#first_token_tensor = sequence_output[:, 0]
first_token_tensor, pool_index = torch.max(sequence_output, dim=1)
While debugging, I saw that sequence_output is a tensor of shape [8,46,1034]. Why is it reduced with a max over dimension 1?…
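
For reference, here is a minimal sketch of what the two lines compute. It assumes the shape reported above is [batch, seq_len, hidden] and uses random data in place of the real model output, so it shows only the shapes and semantics, not the reason the change was made. Indexing `[:, 0]` keeps the hidden state of the first token, while `torch.max(..., dim=1)` takes, for each hidden feature, the largest value across all 46 positions and also returns the position it came from.

```python
import torch

# Assumption: shape is [batch, seq_len, hidden] = [8, 46, 1034], as reported above.
# Random values stand in for the real encoder output.
torch.manual_seed(0)
sequence_output = torch.randn(8, 46, 1034)

# Commented-out variant: pool by taking the first token's hidden state.
first_token_tensor = sequence_output[:, 0]

# Active variant: element-wise max over the sequence dimension (dim=1).
# torch.max with `dim` returns a (values, indices) pair.
max_pooled, pool_index = torch.max(sequence_output, dim=1)

print(tuple(first_token_tensor.shape))  # one vector per batch item
print(tuple(max_pooled.shape))          # same shape as the first-token variant
print(tuple(pool_index.shape))          # position of each maximum along dim 1
```

Both variants yield one vector of size 1034 per batch item, so either can feed the same downstream layer. The difference is that max pooling draws each feature from whichever position has the largest activation rather than from position 0 alone.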
-
### 🚀 The feature, motivation and pitch
Torch's embedding layers only accept int32 and int64 as input. However, for sequences with a small number of distinct possible tokens (e.g., ASCII character em…
-
```
Modeling with ranch...
* Ranch failed to build the chain:
ERROR: Unknown residue (?) detected.
Preceding sequence: SVSVAALLTVVFYIAAVMATNLYGATFPEWFGDLSKSLYTLFQVMTLESWSMGIVRPVMNVHPNA…
-
- Removing the `-W ignore` flag in #1362 caused warnings emitted during the pytest run to be shown again (see the list below)
- Ideally, this list would be empty:
- Expected warnings should be inspe…
-
Hi,
Thanks for providing and presenting this nice work.
As mentioned in your paper, your attention pattern for modeling long sequences can be plugged into any pretrained transformer model.
I wond…
-
Hi,
Have you tried quantizing Mamba? Do you plan on releasing quantized versions?
Can you share your thoughts on quantizing Mamba, given the sensitivity of the model's recurrent dynamics?
Thanks
-
## *Repository Creation Request*
1. #### Coordinating Institute: Indian Institute of Technology Kharagpur
2. #### Lab Name: Software Engineering Virtual Lab
3. #### Approved Proposal: Not Appli…
-
We are doing some very long sequence-to-sequence modeling. I believe I have found a buffer overflow in Label-studio. Annotating one very long time-series caused the annotation type to change and it…
-
### Describe the bug
I've been using SpeechBrain Wav2Vec2 training recipe (with HF integration) on my own data, and noticed that I get significantly different metrics with the same model on validat…
-
Dear all,
Please find the sequence diagram for this user story at:
https://github.com/openETCS/modeling/blob/master/User%20Stories/User%20Story%205.pdf