-
**Describe the bug**
When I train this model with the config below (`train_micro_batch_size_per_gpu=100`),
a runtime error is raised.
If I set `train_micro_batch_size_per_gpu < 100`, it works,
but i w…
-
Hi,
I succeeded in running SFT and RM training in a multi-GPU environment.
With the two trained models, I then tried to run RL training in a multi-GPU setup:
- 4 GPUs (g5.x12large)
- CUDA 11.7
- …
-
Error message:
```shell
terminate called after throwing an instance of 'paddle::platform::EnforceNotMet'
what(): Invoke operator fill_constant error.
Python Callstacks:
File "/home/work/zhaoyijin/dis…
```
-
PixelFlasher 4.8.2.0 started on :2023-04-10 15:28:43
Platform: win32
System Timezone: ('US Mountain Standard Time', 'US Mountain Daylight Time') Offset: -7.0
Configuration Path: C:\Users\rland\AppD…
-
Hi,
I'm having a problem running the cylinder example.
I start with Singularity:
```
singularity shell scratch/ai/drlinfluids/DRLinFluids.sif
drl
of8
```
then to the dir of the cylinde…
-
I have used BERT NextSentencePredictor to find similar sentences or similar news. However, it's super slow, even on a Tesla V100, which is the fastest GPU available so far. It takes around 10 secs for a query tit…
-
Hello,
I am using the newest pull from the master branch of infercnv. It fails at STEP 18 with the following error:
STEP 18: Run Bayesian Network Model on HMM predicted C…
-
After I run the script train_config.yaml, I get the error below:
2023-04-09 13:40:38.702636: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
/usr/lo…
-
&&&& RUNNING TensorRT.trtexec [TensorRT v8205] # trt.exe --onnx=last.onnx --saveEngine=last.engine
[04/06/2023-11:03:40] [I] === Model Options ===
[04/06/2023-11:03:40] [I] Format: ONNX
[04/06/2023…
-
Hi,
Thanks for sharing your code. I am trying to train your network, but when I do, I get the following errors. Apparently a folder containing a group of jpg files is missing. I did a …