mlp Search Results - Githubissues

1000+ results
for mlp


kube-reporting/metering-operator #982

S3 data remains when deleting datasource.

Hi. I installed using release-4.2. Hive uses s3Compatible. ``` apiVersion: metering.openshift.io/v1 kind: MeteringConfig metadata: name: "operator-metering" spec: disableOCPFeatures: tru…

JooyoungJeong updated 4 years ago
3
kyegomez/zeta #247

Should I use linear layers for the input and output of Flash…

I'm curious, do I have to apply linear layers first before I input qkv to FlashAttention? When I get the output from FlashAttention, do I still need the linear layer? I look forward t…

chenhengx0101 updated 1 month ago
2
OliverRensu/SG-Former #2

An error occurred when importing pre-trained model parameter…

Starting Pretrained B_SGFormer Model SGFormer Traceback (most recent call last): File "trainval.py", line 135, in main() File "trainval.py", line 56, in main model = build_model(cfg)…

lihuinian updated 11 months ago
2
yangjianxin1/ClipCap-Chinese #3

GPT2 model loading and MappingNetwork

Thank you for your work! During my experiments, I found the following issues: 1. Loading gpt2 in model.py raises an error ` try: self.gpt2 = GPT2LMHeadModel.from_pretrained(gpt2_path) logger.info('succeed to load pretrain gpt2 model') …

clearwho updated 1 year ago
1
horovod/horovod #3974

Does Horovod support hybrid parallelism with differing ranks…

Does Horovod support the following parallelism setup? => pipeline parallelism, but different stages of the pipeline have different number of data parallel ranks. For example, consider a model which…

hsezhiyan updated 1 year ago
1
Lightning-AI/litgpt #1729

use initial_checkpoint_dir for continue-pretraining but can'…

I want to continue pretraining my custom model on another dataset, so I only changed initial_checkpoint_dir in training.yaml to the latest-run checkpoint dir path, but it seems like the model can't be l…

wodelt updated 1 week ago
6
InternLM/lmdeploy #1224

[Feature] change InternLM2 modeling to unified type

### Motivation When doing the w8a8 quantization in the pytorch engine, I found that InternLM2 is modeled like this: it uses self.attention, self.feed_forward... ```python class InternLM2DecoderLayer(nn.Module)…

yinfan98 updated 6 months ago
1
google-deepmind/alphageometry #63

Test lm_inference_test.py fails

Hello team, Please, I need help solving this issue; the test is failing: python lm_inference_test.py --meliad_path=$MELIAD_PATH --data_path=$DATA I0130 03:37:30.642391 139830854076224 nn_comp…

stephtchoko updated 6 months ago
1
Mattdl/DUA #4

Can't find single merged model

Hello, I have been trying to execute the script “_exp/exps_Numbers.sh_” to reproduce the results for the MNIST-SVHN based Numbers dataset. However, I have been running into a few issues and would a…

amala-wilson updated 3 years ago
2
lucidrains/vit-pytorch #289

Questions about distill_loss

Sorry to bother you. I see the distill_loss in distill.py as: distill_loss = F.kl_div( F.log_softmax(distill_logits / T, dim=-1), F.softmax(teacher_logits / T, dim=…

haoren55555 updated 3 months ago
3
