mlp Search Results - Githubissues

tenstorrent/tt-metal #9723

Falcon7b prefill MLP perf optimizations

Part of: #8349. Goal is to optimise MLP mms (FF1 & FF2) on 128, 1024, 2048 sequence lengths. The max that we can hit is calculated using this formula, and is the same for FF1 & FF2: (M * N * K / (…

pavlepopovic updated 18 hours ago

jbloomAus/SAELens #182

[Proposal] Add MLP transcoders

### Proposal Support training, loading, and inference of MLP transcoders. ### Motivation MLP transcoders were trained by Jacob Dunefsky and Philippe Chlenski and have been shown to be usef…

dtch1997 updated 1 week ago

QwenLM/Qwen2 #714

Is this normal when loading a quantized model? Are some weights unused? Some weights of the model checkpoi…

Is this normal when loading a quantized model? model = AutoModelForCausalLM.from_pretrained("/mnt/d/code_wsl/Qwen2-7B-Instruct-GPTQ-Int8", torch_dtype="auto", …

minxiansheng updated 21 hours ago

huggingface/text-generation-inference #2122

AttributeError: 'MixtralLayer' object has no attribute 'mlp'

### System Info 2024-06-26T08:59:14.473641Z ERROR text_generation_launcher: Error when initializing model Traceback (most recent call last): File "/opt/conda/bin/text-generation-server", line 8, …

icyxp updated 2 days ago

NCIOCPL/ncids #1759

R3/MLP Stakeholder Share-out Messaging

**Description:** Part of the R3/MLP effort includes sharing out of information with the individuals that were part of the stakeholder interviews conducted in April 2024. Agreed with Anna on 06/17 that…

laurelthrash updated 2 days ago

Eclectic-Sheep/sheeprl #305

`sheeprl_eval` loading model with different keys

I cannot _sheeprl-eval_ my trained model, since the keys in the world model's state_dict have different names: Stacktrace Error executing job with overrides: ['checkpoint_path=/home/drt/Deskto…

belerico updated 1 day ago

unslothai/unsloth #638

Can't load CodeLlama-13b

I would like to finetune CodeLlama-13b in a memory efficient way. I was able to do it with CodeLlama-7b, but failing with 13b. I can't load the model `unsloth/codellama-13b-bnb-4bit`: ```pyth…

user799595 updated 4 days ago

yan-hao-tian/VW #21

No MLP Mixer code found

Hi, In your first draft of paper on Arxiv, you mentioned that you are using MLP mixer to mix the channels but I don't see any code that uses MLP Mixer. Can you please clarify? If you removed the M…

vsingh1998 updated 5 days ago

NVIDIA/TensorRT #3930

A Simple Engine Build Fail on TensorRT 8.6

Hi, I built a simple ONNX model on TensorRT 8.6 and got an error: mha_fusion.cpp:344: DCHECK(fc1_ && fc2_ && softmax_) failed. Could not find any implementation for node {ForeignNode[onnx::MatMul…

Aktcob updated 1 week ago

eclipse-basyx/basyx-java-server-sdk #299

No MQTT Events send when changing MLPs

Hello, I was testing the MQTT event feature of the submodel repository. MQTT events for changed "Property" elements work just fine. However, changing values of "MultiLanguageProperty" elements (or …

de-ich updated 3 days ago

1000+ results for mlp
