SuperMedIntel / MedSegDiff

Medical Image Segmentation with Diffusion Model
MIT License

Brats dataset bug!!! #202

Open 1580827935 opened 4 days ago

1580827935 commented 4 days ago

Training on the BraTS dataset prints the errors below, but training still proceeds. Could you tell me what causes these errors?

/home/ubuntu/anaconda3/envs/medsegdiff/bin/python /media/ubuntu/89696b21-997d-42b2-a1c0-b16d972fd7b72/hlq/MedSegDiff-master/scripts/segmentation_train.py --data_name BRATS --data_dir /media/ubuntu/89696b21-997d-42b2-a1c0-b16d972fd7b72/hlq/data/BraTS2020/training --out_dir './BRATS2020_results/' --image_size 256 --num_channels 128 --class_cond False --num_res_blocks 2 --num_heads 1 --learn_sigma True --use_scale_shift_norm False --attention_resolutions 16 --diffusion_steps 1000 --noise_schedule linear --rescale_learned_sigmas False --rescale_timesteps False --lr 1e-4 --batch_size 8
Setting up a new session...
Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/connection.py", line 199, in _new_conn
    sock = connection.create_connection(
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/util/connection.py", line 85, in create_connection
    raise err
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/util/connection.py", line 73, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/connectionpool.py", line 789, in urlopen
    response = self._make_request(
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/connectionpool.py", line 495, in _make_request
    conn.request(
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/connection.py", line 441, in request
    self.endheaders()
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/http/client.py", line 1251, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/http/client.py", line 1011, in _send_output
    self.send(msg)
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/http/client.py", line 951, in send
    self.connect()
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/connection.py", line 279, in connect
    self.sock = self._new_conn()
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/connection.py", line 214, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f7d5d13ca30>: Failed to establish a new connection: [Errno 111] Connection refused

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/connectionpool.py", line 843, in urlopen
    retries = retries.increment(
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='localhost', port=8850): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7d5d13ca30>: Failed to establish a new connection: [Errno 111] Connection refused'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/visdom/__init__.py", line 756, in _send
    return self._handle_post(
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/visdom/__init__.py", line 720, in _handle_post
    r = self.session.post(url, data=data)
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/requests/sessions.py", line 637, in post
    return self.request("POST", url, data=data, json=json, **kwargs)
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/ubuntu/anaconda3/envs/medsegdiff/lib/python3.8/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8850): Max retries exceeded with url: /env/main (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f7d5d13ca30>: Failed to establish a new connection: [Errno 111] Connection refused'))
[Errno 111] Connection refused
on_close() takes 1 positional argument but 3 were given
Exception in user code:

Logging to './BRATS2020_results/'
creating data loader...
creating model and diffusion...
training...

| grad_norm | 7.4 |
| loss | 1.01 |
| loss_cal | 0.232 |
| loss_cal_q0 | 0.241 |
| loss_cal_q1 | 0.254 |
| loss_cal_q2 | 0.225 |
| loss_cal_q3 | 0.19 |
| loss_diff | 1 |
| loss_diff_q0 | 1 |
| loss_diff_q1 | 0.997 |
| loss_diff_q2 | 1 |
| loss_diff_q3 | 0.999 |
| loss_q0 | 1.01 |
| loss_q1 | 1 |
| loss_q2 | 1.01 |
| loss_q3 | 1.01 |
| param_norm | 233 |
| samples | 8 |
| step | 0 |
| vb | 0.0106 |
| vb_q0 | 0.0128 |
| vb_q1 | 0.00749 |
| vb_q2 | 0.0104 |
| vb_q3 | 0.013 |

saving model 0...
saving model 0.9999...

| grad_norm | 6.06 |
| loss | 0.905 |
| loss_cal | 0.131 |
| loss_cal_q0 | 0.123 |
| loss_cal_q1 | 0.132 |
| loss_cal_q2 | 0.137 |
| loss_cal_q3 | 0.129 |
| loss_diff | 0.89 |
| loss_diff_q0 | 0.9 |
| loss_diff_q1 | 0.896 |
| loss_diff_q2 | 0.898 |
| loss_diff_q3 | 0.868 |
| loss_q0 | 0.937 |
| loss_q1 | 0.903 |
| loss_q2 | 0.907 |
| loss_q3 | 0.879 |
| param_norm | 233 |
| samples | 808 |
| step | 100 |
| vb | 0.0151 |
| vb_q0 | 0.0369 |
| vb_q1 | 0.00664 |
| vb_q2 | 0.00833 |
| vb_q3 | 0.0112 |
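
Every traceback in the log above ends in a refused connection to localhost:8850, raised from inside the visdom client, while the training log itself keeps advancing. One way to narrow this down is to check whether anything is listening on that port at all. The snippet below is only a minimal sketch using the standard library; the helper name `port_is_open` is hypothetical and not part of MedSegDiff or visdom, and it assumes the training script expects a visdom server on port 8850, as the traceback suggests.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Host and port are taken from the traceback in this issue.
    host, port = "localhost", 8850
    if port_is_open(host, port):
        print(f"Something is listening on {host}:{port}")
    else:
        print(f"Nothing is listening on {host}:{port} (connection refused or timed out)")
```

If the check reports that nothing is listening, starting a visdom server on that port before training (for example with `python -m visdom.server -port 8850`) would likely silence the connection errors. This is an assumption drawn from the traceback, not a confirmed fix from the maintainers.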