-
### Describe the issue
I tried to use CPUExecutionProvider and CUDAExecutionProvider to run inference on the same single Conv node, and it turns out the results do not match beyond 4 decimal places. I'm wondering …
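A small differences beyond ~4 decimals is expected in float32: CPU and CUDA kernels accumulate partial sums in different orders, so bitwise equality is too strict. The sketch below (pure NumPy, no onnxruntime dependency; the naive `conv2d` helper is illustrative, not the library's implementation) shows the usual way to compare the two providers' outputs: against a higher-precision reference with a float32-appropriate tolerance.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 3, 8, 8)).astype(np.float32)   # NCHW input
w = rng.standard_normal((3, 3, 3, 3)).astype(np.float32)   # OIHW weights

def conv2d(x, w, dtype):
    # Naive "valid" 2D convolution (cross-correlation, as in ONNX Conv),
    # accumulated in the given dtype.
    n, c, h, wd = x.shape
    oc, _, kh, kw = w.shape
    out = np.zeros((n, oc, h - kh + 1, wd - kw + 1), dtype=dtype)
    for b in range(n):
        for o in range(oc):
            for i in range(out.shape[2]):
                for j in range(out.shape[3]):
                    out[b, o, i, j] = (
                        x[b, :, i:i + kh, j:j + kw].astype(dtype)
                        * w[o].astype(dtype)
                    ).sum(dtype=dtype)
    return out

ref = conv2d(x, w, np.float64)   # high-precision reference
f32 = conv2d(x, w, np.float32)   # float32, like the EP kernels

# Compare with a tolerance instead of expecting identical decimals.
np.testing.assert_allclose(f32, ref, rtol=1e-4, atol=1e-4)
```

In practice you would replace `ref` and `f32` with the outputs of two `onnxruntime.InferenceSession` runs (one per provider) and keep the same `assert_allclose` check.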
-
Hi guys,
I have rented two A800 GPUs and chosen the OpenSora 1.1 image on the cloud platform.
But when I try to run the command below:
**python scripts/inference-long.py configs/opensora-v1-1/…
-
Hello,
First of all, thanks for the Groq platform! I use it as an LLM backend to build an agent, but it often raises a `RateLimit` error. I handled it like this:
```
def inference(self, model: str,…
```
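A common way to handle rate-limit errors is exponential backoff with jitter around the API call. The sketch below is a generic retry wrapper; the `retryable` tuple would be the SDK's rate-limit exception class (e.g. `groq.RateLimitError` — an assumption about the SDK, not taken from this post).

```python
import random
import time

def with_retries(call, max_attempts=5, base_delay=1.0, retryable=(Exception,)):
    # Retry `call` on the given exception types, sleeping
    # base_delay * 2**attempt plus jitter between attempts;
    # re-raise the error if the last attempt still fails.
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))

# usage sketch (hypothetical client/call):
# with_retries(lambda: client.chat.completions.create(...),
#              retryable=(RateLimitError,))
```

Keeping the retry logic in one wrapper also avoids repeating try/except blocks in every method that calls the API.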
-
### System Info
- `transformers` version: 4.41.1
- Platform: Linux-5.10.215-203.850.amzn2.x86_64-x86_64-with-glibc2.26
- Python version: 3.10.14
- Huggingface_hub version: 0.23.2
- Safetensors …
-
This is the coolest model I have seen. Thank you for this perfect work. For using it on different platforms, is there any support or information on running inference?
-
### :question: Question
I am sorry for the back-to-back questions, but this is very important to me.
I previously used RetinaNet for detection on 2D data, but now I have shifted to 3D data.
I am…
-
/kind feature
## **Why you need this feature:**
At the moment, Model serving (via KServe) and Feature Serving (via Feast—the Feature Store) are separate components without any guidance on how to b…
-
### 🐛 Describe the bug
The TorchServe version is 0.10.0.
Here is my code:
```
def get_inference_stub(address: str, port: Union[str, int] = 7070):
    channel = grpc.insecure_channel(address + ':' + str(p…
```
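Building the gRPC target by string concatenation is easy to get subtly wrong when `port` can be either `str` or `int`. A minimal sketch of a helper that normalizes this (the `make_target` name is mine, not from TorchServe; 7070 is TorchServe's default gRPC inference port):

```python
from typing import Union

def make_target(address: str, port: Union[str, int] = 7070) -> str:
    # Build the "host:port" target string expected by
    # grpc.insecure_channel; str() handles both int and str ports.
    return f"{address}:{port}"

# usage sketch:
# channel = grpc.insecure_channel(make_target("localhost"))
```

The actual stub class then comes from TorchServe's generated proto modules, as in the snippet above.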
-
### Search before asking
- [X] I have searched the YOLOv8 [issues](https://github.com/ultralytics/ultralytics/issues) and [discussions](https://github.com/ultralytics/ultralytics/discussions) and fou…
-
### Your current environment
```text
The output of `python collect_env.py`
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…