chunbaobao / Deep-JSCC-PyTorch

An implementation of Deep JSCC for wireless image transmission in PyTorch

45 stars, 8 forks

About the train.py #2

Status: Closed (closed by Qijian-Z 4 months ago)

Qijian-Z commented 8 months ago

Hello, let me ask the question again. When I run train.py, I get the error "RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor". I tried to fix it but couldn't.

chunbaobao commented 8 months ago

Alright, I just realized that I forgot to move the input tensors from the test dataloader to the GPU when the parallel option is enabled, which causes the device mismatch error. The fix has been committed. By the way, if you have only one GPU or none, you can disable the parallel parameter.
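
A minimal sketch of this kind of fix, not the repository's actual code: the `nn.Conv2d` model and the random batch below are hypothetical stand-ins. The point is only that a batch coming out of a dataloader sits on the CPU by default and has to be moved to the device that holds the model weights before the forward pass.

```python
import torch
import torch.nn as nn

# Use the GPU if one is available, otherwise fall back to the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in model; the real network lives in the repository.
model = nn.Conv2d(3, 8, kernel_size=3, padding=1).to(device)

# A DataLoader yields CPU tensors by default, so this mimics one test batch.
images = torch.rand(4, 3, 32, 32)

# Move the batch to the same device as the model weights.
# Skipping this step on a CUDA machine raises the RuntimeError quoted above.
images = images.to(device)

with torch.no_grad():
    outputs = model(images)

print(outputs.shape, outputs.device)
```

On a machine without a GPU the sketch simply runs on the CPU, so the same code path works in both cases.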

Qijian-Z commented 8 months ago

Thanks a lot!

Qijian-Z commented 8 months ago

Sorry, I have one more question. I tried to run train.py on the ImageNet dataset, and it stopped here:

Training Start
loading data of imagenet
Namespace(seed=2048, lr=0.001, epochs=300, batch_size=32, weight_decay=0.0005, channel='AWGN', saved='./saved', snr_list=[19.0, 13.0, 7.0, 4.0, 1.0], ratio_list=[0.3333333333333333, 0.16666666666666666, 0.08333333333333333], num_workers=4, dataset='imagenet', parallel=True, if_scheduler=False, step_size=640, device='cuda:0', gamma=0.5, disable_tqdm=True)
the inner channel is 19

What should I do?

chunbaobao commented 8 months ago

Actually, the ImageNet dataset is very large, so it looks like the model has been initialized and training has started, but the first epoch has not finished yet. You can change the disable_tqdm option in the arguments to show the training progress.
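
To get a rough feel for why the first epoch takes so long, here is a back-of-the-envelope count of iterations per epoch. It assumes the full ImageNet-1k training split (1,281,167 images) is used and that the last partial batch is kept; the thread does not confirm either, so treat the number as an estimate.

```python
import math

# Assumption: full ImageNet-1k training split; the repository may use a subset.
num_train_images = 1_281_167

# Taken from the Namespace printed in the question above.
batch_size = 32

# Number of optimizer steps needed before the first epoch finishes.
batches_per_epoch = math.ceil(num_train_images / batch_size)

print(batches_per_epoch)  # 40037
```

With the progress bar disabled, nothing is printed during those roughly 40,000 steps, which can look like a hang.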

Qijian-Z commented 7 months ago

I would like to ask about the results: after evaluating the model, I feel it is not performing well. Did you actually run the code and get results similar to those in the paper?

chunbaobao commented 7 months ago

Could you share the parameters you used and the results you got? @Qijian-Z

Qijian-Z commented 6 months ago

Alright, I used the model named "imagenet_100_0.33_100.00_32_19.pth" with snr = 20 and ratio = 1/3, and got psnr = 13.01. I may have misunderstood: doesn't 100.00 mean you trained under snr = 100?
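
For readers comparing numbers, this is a minimal sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE). It assumes pixel values scaled to [0, 1]; the repository's evaluation code may use a different range or averaging, and the `psnr` helper below is hypothetical.

```python
import math

def psnr(original, reconstructed, max_value=1.0):
    """PSNR in dB between two equal-length sequences of pixel values."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(max_value ** 2 / mse)

# Illustration: a constant per-pixel error of sqrt(0.05) gives MSE = 0.05,
# which corresponds to about 13.01 dB.
error = math.sqrt(0.05)
original = [0.2, 0.4, 0.6, 0.7]
reconstructed = [x + error for x in original]
print(round(psnr(original, reconstructed), 2))
```

Under the [0, 1] assumption, a PSNR of 13.01 dB corresponds to a mean squared error of about 0.05.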

Luckwjf commented 6 months ago

The project you shared is very good and useful. Thank you very much for your efforts. We look forward to more project sharing in semantic communication.

chunbaobao commented 6 months ago

> Alright, I used the model named "imagenet_100_0.33_100.00_32_19.pth" with snr = 20 and ratio = 1/3, and got psnr = 13.01. I may have misunderstood: doesn't 100.00 mean you trained under snr = 100?

Yes, it does. The repository is simply for a sanity check, so the visualization.py file has been excluded.
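
To put the two SNR values from this exchange side by side, here is a minimal sketch of how the noise level of a real-valued AWGN channel follows from an SNR in dB. The helpers `noise_std` and `awgn` are hypothetical, and the repository's channel may normalize power or handle complex symbols differently.

```python
import math
import random

def noise_std(signal_power, snr_db):
    """Noise standard deviation for a given signal power and SNR in dB."""
    noise_power = signal_power / (10 ** (snr_db / 10))
    return math.sqrt(noise_power)

def awgn(signal, snr_db, rng):
    """Add white Gaussian noise to a real-valued signal at the given SNR."""
    signal_power = sum(x * x for x in signal) / len(signal)
    sigma = noise_std(signal_power, snr_db)
    return [x + rng.gauss(0.0, sigma) for x in signal]

# With unit signal power, compare the training SNR from the file name
# (100 dB) with the evaluation SNR used in the question (20 dB).
std_train = noise_std(1.0, 100.0)
std_eval = noise_std(1.0, 20.0)
print(std_train, std_eval)

rng = random.Random(0)
noisy = awgn([1.0, -1.0, 1.0, -1.0], snr_db=20.0, rng=rng)
```

Under these assumptions the noise standard deviation at 20 dB is about four orders of magnitude larger than at 100 dB, so a model trained at 100 dB sees an almost noiseless channel. Whether that mismatch accounts for the PSNR reported above is not established in this thread.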

chunbaobao commented 6 months ago

> The project you shared is very good and useful. Thank you very much for your efforts. We look forward to more project sharing in semantic communication.

Thank you so much, I will keep working and sharing.

Qijian-Z commented 6 months ago

Thank you for your reply. Sorry to bother you, but could you provide us with the visualization.py file?

chunbaobao commented 6 months ago

> Thank you for your reply. Sorry to bother you, but could you provide us with the visualization.py file?

Sorry, I've been busy recently. The visualization of the results hasn't been written yet, but it is on my to-do list.

Qijian-Z commented 6 months ago

Thank you. Please let me know when you have written it. Respect.

chunbaobao commented 4 months ago

Updated. The results for the cifar10 dataset over the AWGN channel are provided in README.md/results.