mlp-architecture Search Results

1000+ results for mlp-architecture


arcee-ai/mergekit #342

Need some help in merging same architectures, but with diffe…

Hello! I actually have two models - CodeLLaMa-13b-Python and CodeLLaMa-13b, that need to be merged. The overall goal is to merge two models (one trained on Python and another trained on any other lan…

choprahetarth updated 4 months ago
4
Theano/Theano #4087

Possible improvements in speed of Jacobian computations on G…

The current Jacobian helper function implementation in `gradient.py` seems like it might be slower than it needs to be. Given a vector output (size _N_) of a graph with a vector input (size _M_), it c…

matt-graham updated 7 years ago
5
state-spaces/mamba #7

About max token length

What is the max token length that this model can support? Can it support more than 10k?

RevolGMPHL updated 9 months ago
28
NVIDIA-Merlin/dataloader #124

Feed pre-trained embeddings to NVTabular

**What is your question?** I have a dataset that includes a column feature of pre-trained embeddings. I couldn't find any documentation or examples on how this column should be passed to NVTabular. …

MelissaKR updated 1 year ago
6
OscarSavolainenDR/Quantization-Tutorials #12

Question about PTQ fake quantized FP32 model to INT8

Hi Oscar, First of all many thanks for your tutorials, they are incredibly useful to learn quantization and get hands-on experience on this! I have the following situation and perhaps you could …

fabriziojpiva updated 5 months ago
7
cchallu/n-hits #11

I can't see the model's detail in the code

I want to see the model's details in the code, but I found that PyTorch Lightning can't be debugged in PyCharm; it just runs. How can I see how the training data flows through the model? And it will make me understan…

signalworker123 updated 2 years ago
7
Lotayou/Face-Renovation #11

Training error

Hello again, I got this training error when running "train.py", how can I solve this? ``` (hiface) G:\HiFaceGAN\Face-Renovation-master>python train.py train.py dataset [TrainDataset] of size 7 w…

bycloudai updated 4 years ago
3
vllm-project/vllm #8315

[Usage]: Correct way to load lora model

### Your current environment ```text Collecting environment information... PyTorch version: 2.2.1+cu121 Is debug build: False CUDA used to build PyTorch: 12.1 ROCM used to build PyTorch: N/A …

xyg-coder updated 2 months ago
1
gusdlf93/Paper_Survey #17

[2022 arXiv] EfficientFormer : Vision Transformers at Mobile…

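One-line review: our model is fast and lightweight, so use it. Many Transformer-related models have been released. The authors combined only their strengths to build the model with the best efficiency. Observation 1: Patch Embedding -> Convolution Stem. Pat… that use a larger kernel and stride…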

gusdlf93 updated 2 years ago
1
NVIDIA/TensorRT-LLM #1770

Fail to build w4a8_awq on Llama 13b

### System Info ubuntu 20.04 tensorrt 10.0.1 tensorrt-cu12 10.0.1 tensorrt-cu12-bindings 10.0.1 tensorrt-cu12-libs 10.0.1 tensorrt-llm …

Hongbosherlock updated 21 hours ago
12

