-
I cloned this project and imported it into my own Python code. I updated my code as below:
siamese_net.compile(loss="AM-softmax", optimizer=optimizer, metrics=['accuracy'])
However, I got the below erro…
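One likely cause (assuming the error is Keras rejecting an unknown loss identifier): `"AM-softmax"` is not a built-in Keras loss string, so `compile` cannot resolve it. A minimal sketch of passing an additive-margin softmax loss as a callable instead, with hypothetical `scale`/`margin` values:

```python
import tensorflow as tf

def am_softmax_loss(scale=30.0, margin=0.35):
    # Hedged sketch of AM-Softmax: y_pred is assumed to be cosine
    # similarities and y_true one-hot labels; the margin is subtracted
    # only from the target-class logit before a scaled cross-entropy.
    def loss(y_true, y_pred):
        y_true = tf.cast(y_true, y_pred.dtype)
        logits = scale * (y_pred - margin * y_true)
        return tf.nn.softmax_cross_entropy_with_logits(labels=y_true,
                                                       logits=logits)
    return loss

# Pass the callable rather than the unrecognised string "AM-softmax":
# siamese_net.compile(loss=am_softmax_loss(), optimizer=optimizer,
#                     metrics=['accuracy'])
```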
-
I want to launch a primitive in a function and return immediately. I am getting a segfault (SIGSEGV) that looks like memory corruption. If I wait for completion before returning, it works. I think the problem is …
-
So I am studying attention networks, one of the toughest topics to understand, and the book so far explains it well. However, in the scaled dot-product example, it shows how scaling the product of the ks and q by 100 skews the result. I extended this…
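The skew described above is easy to reproduce numerically: multiplying the query–key scores by a large factor saturates the softmax into a near one-hot distribution, which is exactly why scaled dot-product attention divides by √d_k. A small sketch with illustrative score values:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # shift for numerical stability
    return e / e.sum()

# Illustrative q·k scores for four keys (not from the book's example):
scores = np.array([0.31, 0.12, -0.45, 0.20])

print(softmax(scores))        # fairly even distribution
print(softmax(scores * 100))  # ~one-hot: the largest score dominates

# Dividing by sqrt(d_k) keeps the logits in a moderate range, so the
# softmax stays smooth and gradients do not vanish for non-max keys.
```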
-
Hi,
Could you please provide details on how you test the model? Specifically, I am interested in understanding how the model is tested when using AAM Softmax for training.
When testing, are labe…
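For context on the question above: a common convention (an assumption here, not necessarily this repository's method) is that the AAM-Softmax margin and scale are used only during training; at test time, pairs are scored by plain cosine similarity between L2-normalised embeddings, with no labels or margin involved. A minimal sketch:

```python
import numpy as np

def cosine_score(emb1, emb2):
    # Score a trial by cosine similarity of L2-normalised embeddings;
    # the AAM-Softmax margin/scale play no role at inference.
    a = emb1 / np.linalg.norm(emb1)
    b = emb2 / np.linalg.norm(emb2)
    return float(a @ b)

# Hypothetical embeddings from the trained encoder:
enroll = np.array([0.2, 0.9, -0.1])
probe = np.array([0.25, 0.85, 0.0])
same_class = cosine_score(enroll, probe) > 0.5  # threshold tuned on a dev set
```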
-
Hi @RetroCirce, I want to thank you for your great work!
I am playing around with the model checkpoint you provided (`HTSAT_AudioSet_Saved_6.ckpt`).
I can successfully load the model checkpoint and m…
-
### 🚀 The feature, motivation and pitch
Hello,
I am working on editing knowledge in transformers, which, in the case of knowledge deletion, requires minimizing the likelihood of targeted…
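One way the likelihood-minimization step can be realised (a hedged sketch, not the poster's actual method) is gradient ascent on the targeted tokens' cross-entropy, i.e. ordinary descent on the negated loss. `logits` below is a random stand-in for a transformer's output:

```python
import torch
import torch.nn.functional as F

vocab = 100
logits = torch.randn(1, 5, vocab, requires_grad=True)  # (batch, seq, vocab)
targets = torch.randint(0, vocab, (1, 5))              # targeted token ids

# Negating the NLL turns minimization into likelihood *reduction*:
nll = F.cross_entropy(logits.view(-1, vocab), targets.view(-1))
(-nll).backward()  # a subsequent optimizer step then raises the NLL
```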
-
Hi, while studying the load importance loss, I found that the parameters passed to the function `load_importance_loss` are softmax-normalized scores and logits with noise (see [moe_layer.py L281](https://gi…
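For reference, the classic importance loss (in the style of Shazeer et al., 2017; a sketch, not necessarily this repository's exact formula) sums each expert's softmax-normalized gate scores over the batch and penalises the squared coefficient of variation, pushing experts toward equal importance:

```python
import numpy as np

def importance_loss(gate_scores):
    # gate_scores: (tokens, experts) softmax-normalised routing scores.
    # An expert's importance is the total score it receives; the loss is
    # the squared coefficient of variation of those importances.
    importance = gate_scores.sum(axis=0)
    return (importance.std() / importance.mean()) ** 2

skewed = np.array([[0.7, 0.3],
                   [0.6, 0.4]])
balanced = np.array([[0.5, 0.5],
                     [0.5, 0.5]])
assert importance_loss(balanced) < importance_loss(skewed)
```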
-
Thank you for this amazing work!
I was wondering if the fp8 implementation of FlashAttention-3 will be available for the public to use? My main concern is accuracy (block quant may have alleviated thi…
-
Hello,
thank you for this great work and the provided resources.
I have a question regarding how to get a scene graph from the model's predictions.
I am using the config file `configs/config_3D…
-
I am running the llama3 model on an RTX 4090 with fp8 quantization. In the [result](https://github.com/NVIDIA/TensorRT-LLM/blob/main/cpp/include/tensorrt_llm/executor/executor.h#L323), `outputTokenIds` see…