Closed. SurvivorNo1 closed this issue 1 year ago.
Could you share your training command and the path to the original yml config file?
Thank you for the reply. Here is the training command I run in the notebook:
%cd ~/PaddleSeg/contrib/MedicalSeg
!python train.py --config configs/msd_brain_seg/unetr_msd_brain_seg_1e-4.yml \
--do_eval --save_interval 1000 --has_dataset_json False --is_save_data False --num_workers 4 --log_iters 10 --use_vdl
And here is the path to my yml file:
/home/aistudio/PaddleSeg/contrib/MedicalSeg/configs/msd_brain_seg/unetr_msd_brain_seg_1e-4.yml
Below is my config file. Compared with the official one, I modified in_channels, num_classes, the dataset paths, and the transforms of train_dataset:
data_root: /tmp/Amos2022
batch_size: 4
iters: 30000

train_dataset:
  type: msd_brain_dataset
  dataset_root: phase0/
  result_dir: phase0/labels
  num_classes: 16
  transforms:
    - type: RandomRotation4D
      degrees: 90
      rotate_planes: [[1, 2], [1, 3], [2, 3]]
    - type: RandomFlip4D
      flip_axis: [1, 2, 3]
      prob: 0.1
  mode: train

val_dataset:
  type: msd_brain_dataset
  dataset_root: phase0/
  result_dir: phase0/labels
  num_classes: 16
  transforms: []
  mode: val
  dataset_json_path: "/tmp/Amos2022/raw/dataset.json"

test_dataset:
  type: msd_brain_dataset
  dataset_root: phase0/
  result_dir: phase0/labels
  num_classes: 16
  transforms: []
  mode: test
  dataset_json_path: "/tmp/Amos2022/raw/dataset.json"

optimizer:
  type: AdamW
  weight_decay: 1.0e-4

lr_scheduler:
  type: PolynomialDecay
  decay_steps: 30000
  learning_rate: 0.0001
  end_lr: 0
  power: 0.9

loss:
  types:
    - type: MixedLoss
      losses:
        - type: CrossEntropyLoss
          weight: Null
        - type: DiceLoss
      coef: [1, 1]
  coef: [1]

model:
  type: UNETR
  img_shape: (128, 128, 128)
  in_channels: 1
  num_classes: 16
  embed_dim: 768
  patch_size: 16
  num_heads: 12
  dropout: 0.1
Hi, preprocessing the data into CHW format is enough; BCHW is what you get after the dataloader adds the batch dimension. Also, the error message shows the failure happens inside the dataloader while loading data, so please check whether the returned data is abnormal: debug dataset.py on its own and check whether its output contains None or other invalid values.
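Per the suggestion above, the dataset can be checked in isolation before it ever reaches the DataLoader. A minimal sketch (the `check_dataset` helper and the stand-in dataset below are hypothetical, not part of MedicalSeg; it only assumes samples are indexable (image, label) pairs):

```python
import numpy as np

def check_dataset(dataset, num_samples=5):
    """Iterate a few samples directly, bypassing the DataLoader, and flag
    None returns or NaN values -- typical causes of a killed blocking queue."""
    bad = []
    for i in range(min(num_samples, len(dataset))):
        sample = dataset[i]
        if sample is None or any(x is None for x in sample):
            bad.append((i, "None"))
            continue
        image, label = np.asarray(sample[0]), np.asarray(sample[1])
        print(i, image.shape, label.shape)
        if np.isnan(image).any():
            bad.append((i, "NaN in image"))
    return bad

# Stand-in dataset: a list of (image, label) pairs.
fake = [(np.zeros((1, 8, 8, 8), np.float32),
         np.zeros((8, 8, 8), np.int64))] * 3
print(check_dataset(fake))  # [] -- no problems found
```

Running this over the first few indices of the real dataset object prints each sample's shapes, which also makes rank mismatches visible immediately.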
Thanks for your reply. In that case, do the transforms in the config file operate on BCHW or on CHW? In other words, should the transforms in my config be 4D or 3D?
Transforms run before the dataloader, where the data is still 3D; it only becomes 4D after the dataloader.
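To illustrate the point above with a toy version (the function below is a hypothetical NumPy stand-in for a RandomFlip3D-style transform, not MedicalSeg's implementation): the transform sees a single 3-D sample, and the batch axis only exists after the DataLoader.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_flip_3d(volume, flip_axis=(0, 1, 2), prob=0.1):
    """Toy RandomFlip3D: operates on one 3-D volume (D, H, W),
    i.e. before any batch axis exists."""
    for axis in flip_axis:
        if rng.random() < prob:
            volume = np.flip(volume, axis=axis)
    return volume

vol = np.arange(8, dtype=np.float32).reshape(2, 2, 2)
print(random_flip_3d(vol).shape)  # shape is preserved: (2, 2, 2)
```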
Hi, following your suggestion I preprocessed both the data and the labels to (128, 128, 128), updated the transforms accordingly, and ran it again; it still does not run. Should the data be processed to (1, 128, 128, 128)?
Here is my training command:
%cd ~/PaddleSeg/contrib/MedicalSeg
!python train.py --config configs/msd_brain_seg/unetr_msd_brain_seg_1e-4.yml \
--do_eval --save_interval 1000 --has_dataset_json False --is_save_data False --num_workers 4 --log_iters 10 --use_vdl
Training log:
/home/aistudio/PaddleSeg/contrib/MedicalSeg
2023-03-23 19:59:38 [INFO]
------------Environment Information-------------
platform: Linux-4.15.0-140-generic-x86_64-with-debian-stretch-sid
Python: 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0]
Paddle compiled with cuda: True
NVCC: Build cuda_11.2.r11.2/compiler.29618528_0
cudnn: 8.2
GPUs used: 1
CUDA_VISIBLE_DEVICES: None
GPU: ['GPU 0: Tesla V100-SXM2-32GB']
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~16.04) 7.5.0
PaddlePaddle: 2.4.0
------------------------------------------------
2023-03-23 19:59:38 [INFO]
---------------Config Information---------------
batch_size: 4
data_root: /tmp/Amos2022
iters: 30000
loss:
  coef:
  - 1
  types:
  - coef:
    - 1
    - 1
    losses:
    - type: CrossEntropyLoss
      weight: null
    - type: DiceLoss
    type: MixedLoss
lr_scheduler:
  decay_steps: 30000
  end_lr: 0
  learning_rate: 0.0001
  power: 0.9
  type: PolynomialDecay
model:
  dropout: 0.1
  embed_dim: 768
  img_shape: (128, 128, 128)
  in_channels: 1
  num_classes: 16
  num_heads: 12
  patch_size: 16
  type: UNETR
optimizer:
  type: AdamW
  weight_decay: 0.0001
test_dataset:
  dataset_json_path: /tmp/Amos2022/raw/dataset.json
  dataset_root: phase0/
  mode: test
  num_classes: 16
  result_dir: phase0/labels
  transforms: []
  type: msd_brain_dataset
train_dataset:
  dataset_root: phase0/
  mode: train
  num_classes: 16
  result_dir: phase0/labels
  transforms:
  - degrees: 90
    type: RandomRotation3D
  - flip_axis:
    - 0
    - 1
    - 2
    prob: 0.1
    type: RandomFlip3D
  type: msd_brain_dataset
val_dataset:
  dataset_json_path: /tmp/Amos2022/raw/dataset.json
  dataset_root: phase0/
  mode: val
  num_classes: 16
  result_dir: phase0/labels
  transforms: []
  type: msd_brain_dataset
------------------------------------------------
W0323 19:59:38.168220 6958 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W0323 19:59:38.168257 6958 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
Traceback (most recent call last):
  File "train.py", line 232, in <module>
    main(args)
  File "train.py", line 227, in main
    has_dataset_json=args.has_dataset_json)
  File "/home/aistudio/PaddleSeg/contrib/MedicalSeg/medicalseg/core/train.py", line 138, in train
    logits_list = model(images)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleSeg/contrib/MedicalSeg/medicalseg/models/unetr.py", line 432, in forward
    transf_input = self.embed(x)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/home/aistudio/PaddleSeg/contrib/MedicalSeg/medicalseg/models/unetr.py", line 181, in forward
    patch_embeddings = self.patch_embeddings(x)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/dygraph/layers.py", line 948, in __call__
    return self.forward(*inputs, **kwargs)
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/conv.py", line 1055, in forward
    use_cudnn=self._use_cudnn,
  File "/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/functional/conv.py", line 216, in _conv_nd
    False,
ValueError: (InvalidArgument) The input's dimension and filter's dimension of Op(Conv) should be equal. But received: the input's shape is [4, 128, 128, 128], the input's dimension is 4; the filter's shape is [768, 1, 16, 16, 16], the filter's dimension is 5.
  [Hint: Expected in_dims.size() == filter_dims.size(), but received in_dims.size():4 != filter_dims.size():5.] (at /paddle/paddle/phi/infermeta/binary.cc:483)
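The ValueError above is a rank mismatch: the patch-embedding convolution has a 5-D filter ([768, 1, 16, 16, 16]), so it expects a 5-D input [N, C, D, H, W], while the batch arrives as [4, 128, 128, 128] with no channel axis. A minimal NumPy sketch of the shapes involved (the fix is a singleton channel axis per sample, before batching):

```python
import numpy as np

# One preprocessed CT volume without a channel axis: (D, H, W).
vol = np.zeros((128, 128, 128), dtype=np.float32)

# Prepend the singleton channel axis -> (1, 128, 128, 128).
vol = vol[np.newaxis, ...]
print(vol.shape)

# After the DataLoader stacks a batch of 4 such samples the input is
# 5-D, matching the [N, C, D, H, W] layout a 3-D convolution expects.
batch = np.stack([vol] * 4)
print(batch.shape)
```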
Yes, I think I remembered it wrong: the preprocessed medical data should be CHWD rather than CHW, with C = 1, i.e. the (1, 128, 128, 128) you mentioned.
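For reference, a minimal sketch of that preprocessing step (the `to_chwd` helper is hypothetical, not MedicalSeg code): center-crop or zero-pad a raw 3-D volume to (128, 128, 128), then prepend the C = 1 axis.

```python
import numpy as np

def to_chwd(volume, target=(128, 128, 128)):
    """Center-crop or zero-pad a 3-D volume to `target`, then prepend the
    channel axis so the result is CHWD with C = 1."""
    out = np.zeros(target, dtype=volume.dtype)
    # Per axis: where to read from the source and where to write in the output.
    src = tuple(slice(max((s - t) // 2, 0), max((s - t) // 2, 0) + min(s, t))
                for s, t in zip(volume.shape, target))
    dst = tuple(slice(max((t - s) // 2, 0), max((t - s) // 2, 0) + min(s, t))
                for s, t in zip(volume.shape, target))
    out[dst] = volume[src]
    return out[np.newaxis, ...]

print(to_chwd(np.ones((100, 150, 128), np.float32)).shape)  # (1, 128, 128, 128)
```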
Issue confirmation: Search before asking
Please ask your question
Hi, I would like to train UNETR on my own dataset. It has 16 classes including the background and consists of .nii.gz CT volumes. I processed the data to (1, 128, 128, 128) and the labels to (1, 128, 128, 128), but the model fails to run with:
SystemError: (Fatal) Blocking queue is killed because the data reader raises an exception. [Hint: Expected killed_ != true, but received killed_:1 == true:1.] (at /paddle/paddle/fluid/operators/reader/blocking_queue.h:166)
I then processed the data to (128, 128, 128) and the labels to (128, 128, 128); the model still fails with the same message. My config file is as follows: