Open PriyaBSavithiri opened 5 months ago
Hi,
I am trying to run the ResNet50 code with the official command, without any modifications, but I get a segmentation fault (core dumped) both on bare metal and in Docker.
COMMAND: bash run_offline.sh 1
LOG:
Docker:
[root@a29df81b85ee pytorch-cpu]# bash run_offline.sh 1
user_default.conf section default
default resnet50..performance_sample_count_override = 1024
custom resnet50..performance_sample_count_override = 1024
default .Offline.target_qps = 25150
custom .Offline.target_qps = 37725.0
default .Server.target_qps = 19810
custom .Server.target_qps = 29715.0
default .Server.min_duration = 600000
custom .Server.min_duration = 600000
[SUT] Creating instance 0
run_offline.sh: line 85: 2859 Segmentation fault (core dumped) $numactl ${APP} --scenario Offline --mode Performance --mlperf_conf ${CUR_DIR}/src/mlperf.conf --user_conf ${USER_CONF} --model_name resnet50 --rn50-part1 ${RN50_START} --rn50-part3 ${RN50_END} --rn50-full-model ${RN50_FULL} --data_path ${DATA_DIR} --num_instance $number_cores --warmup_iters 20 --cpus_per_instance $CPUS_PER_INSTANCE --total_sample_count 50000 --batch_size $1
Baremetal:
(rn50-mlperf)/home/user:~/inference_results_v4.0/closed/Intel/code/resnet50/pytorch-cpu$ bash run_offline.sh 1
user_default.conf section default
default resnet50..performance_sample_count_override = 1024
custom resnet50..performance_sample_count_override = 1024
default .Offline.target_qps = 25150
custom .Offline.target_qps = 37725.0
default .Server.target_qps = 19810
custom .Server.target_qps = 29715.0
default .Server.min_duration = 600000
custom .Server.min_duration = 600000
[SUT] Creating instance 0
run_offline.sh: line 85: 3835024 Segmentation fault (core dumped) $numactl ${APP} --scenario Offline --mode Performance --mlperf_conf ${CUR_DIR}/src/mlperf.conf --user_conf ${USER_CONF} --model_name resnet50 --rn50-part1 ${RN50_START} --rn50-part3 ${RN50_END} --rn50-full-model ${RN50_FULL} --data_path ${DATA_DIR} --num_instance $number_cores --warmup_iters 20 --cpus_per_instance $CPUS_PER_INSTANCE --total_sample_count 50000 --batch_size $1
Is anyone else facing the same problem?
Thanks in advance.