CherBass / ICAM

ICAM: Interpretable Classification via Disentangled Representations and Feature Attribution Mapping

How to change the default batch size? The default batch size is 2. #1

Umair6977 opened this issue 2 years ago

Umair6977 commented 2 years ago

I am waiting for a response. Thanks

CherBass commented 2 years ago

If you go into options.py, you can change the batch_size param to another even number. You can also override it from the terminal when starting training:

    python train.py --batch_size 4

Please note that all experiments were done with batch_size=2; larger batch sizes were not tested. Also, make sure your branch is up to date, as there is a small bug fix that is necessary.
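(For reference, command-line params like this are usually defined with argparse; a minimal sketch of how such a batch_size option might look - the actual options.py in ICAM may differ:)

    # hypothetical sketch of an options parser, not the repo's exact code
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('--batch_size', type=int, default=2,
                        help='training batch size (all experiments used 2)')
    opts = parser.parse_args()
    # `python train.py --batch_size 4` -> opts.batch_size == 4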

Umair6977 commented 2 years ago

Thanks for your response (everything runs smoothly with the default batch_size=2), but when I tried changing it in options.py I got an error:

    (images_a[0:1, ::], mask_a[0:1, ::], images_a1[0:1, ::], images_a2[0:1, ::], images_a3[0:1, ::], images_a4[0:1, ::], images_a5[0:1, ::]), 3)
    RuntimeError: Sizes of tensors must match except in dimension 1. Got 1 and 32 (The offending index is 1)

Is it directly related to batch_size? I need your suggestions.

Thanks
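(For context, this RuntimeError comes from torch.cat: all tensors being concatenated must have identical sizes in every dimension except the concatenation dimension. A minimal standalone reproduction, not taken from the ICAM code:)

    import torch

    a = torch.randn(1, 1, 32, 32)    # (batch, channel, H, W)
    b = torch.randn(32, 1, 32, 32)   # batch dim is 32 instead of 1

    try:
        torch.cat((a, b), 3)         # cat along dim 3, but dim 0 disagrees
    except RuntimeError as e:
        print(e)                     # 'Sizes of tensors must match except in dimension ...'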

CherBass commented 2 years ago

As I've mentioned, the fix for this is already in master, so you need to pull the latest branch.

lines 563-564 in model.py:

            mask_a = (self.mask_a.unsqueeze(1)).detach()
            mask_b = (self.mask_b.unsqueeze(1)).detach()
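(The unsqueeze(1) inserts an explicit channel dimension so the mask's shape lines up with the image tensors before concatenation, and detach() removes it from the autograd graph. A small illustration with hypothetical shapes, not the repo's actual ones:)

    import torch

    mask = torch.randn(2, 32, 32)        # (batch, H, W) - no channel dim
    mask = mask.unsqueeze(1).detach()    # -> (2, 1, 32, 32), no gradients tracked
    print(mask.shape)                    # torch.Size([2, 1, 32, 32])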
Umair6977 commented 2 years ago

The batch_size has been changed now, but another ValueError has occurred:

    ValueError: Found input variables with inconsistent numbers of samples: [674, 676]

I need your suggestions!

Thanks

CherBass commented 2 years ago

Can you send me the full error message, alongside your param config? (i.e. options.py params)

Umair6977 commented 2 years ago

Thanks a lot! It has been resolved; if I find another bug, I will let you know.

stay blessed

Umair6977 commented 2 years ago

Hi, what's the function of these code lines in train.py?

example validation

    # try:
    #     _validation(opts, model, healthy_val_dataloader, anomaly_val_dataloader)
    # except Exception as e:
    #     print(f'Encountered error during validation - {e}')
    #     raise e

example test

try:
    _test(opts, model, healthy_test_dataloader, anomaly_test_dataloader)
except Exception as e:
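    # NB: the message below says 'validation' even though this block wraps _test()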
    print(f'Encountered error during validation - {e}')
    raise e

save last model

saver.write_model(ep, total_it, iter_counter, model, model_name='model_last')
saver.write_img(ep, total_it, model)

return

I found this error:

    Encountered error during validation - Found input variables with inconsistent numbers of samples: [674, 676]
    Traceback (most recent call last):
      File "train.py", line 635, in <module>
        main()
      File "train.py", line 178, in main
        raise e
      File "train.py", line 175, in main
        _test(opts, model, healthy_test_dataloader, anomaly_test_dataloader)
      File "train.py", line 419, in _test
        val_accuracy = accuracy_score(val_pred_temp, val_labels)
      File "/home/data/Umair/.conda/envs/Umair_pytorch/lib/python3.6/site-packages/sklearn/metrics/_classification.py", line 185, in accuracy_score
        y_type, y_true, y_pred = _check_targets(y_true, y_pred)
      File "/home/data/Umair/.conda/envs/Umair_pytorch/lib/python3.6/site-packages/sklearn/metrics/_classification.py", line 80, in _check_targets
        check_consistent_length(y_true, y_pred)
      File "/home/data/Umair/.conda/envs/Umair_pytorch/lib/python3.6/site-packages/sklearn/utils/validation.py", line 212, in check_consistent_length
        " samples: %r" % [int(l) for l in lengths])
    ValueError: Found input variables with inconsistent numbers of samples: [674, 676]

Thanks!

CherBass commented 2 years ago

validation >> validation function that evaluates on the validation dataset - it's meant to compute classification scores and do some example translations (between classes 0 and 1, and to create a feature attribution map). This is done at the end of every epoch.

test >> testing function that evaluates on the test dataset - it's meant to compute classification scores and do some example translations (between classes 0 and 1, and to create a feature attribution map). This is done once at the end of training.

save last model >> saves a checkpoint of the last model + does some plotting.
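(Put together, the end of train.py roughly follows this flow. This is a simplified sketch of the structure described above, not the exact repo code - train_one_epoch is a hypothetical placeholder:)

    for ep in range(n_epochs):
        train_one_epoch(...)   # training iterations (placeholder name)
        # per epoch: validation metrics + example translations
        _validation(opts, model, healthy_val_dataloader, anomaly_val_dataloader)

    # once, after training: test metrics (saved to test_results.json)
    _test(opts, model, healthy_test_dataloader, anomaly_test_dataloader)

    # checkpoint the last model + plotting
    saver.write_model(ep, total_it, iter_counter, model, model_name='model_last')
    saver.write_img(ep, total_it, model)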

Sorry, but I'm not sure how you got the error; I can't replicate it even with batch_size=4. Also, please note you're not meant to change the val_batch_size param, as it could lead to errors, and those functions might then not work.
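(For anyone hitting the same thing: the ValueError means the collected predictions and labels ended up with different lengths - 674 vs 676 - which sklearn's accuracy_score rejects. A defensive check before scoring could look like the sketch below; a hypothetical debugging aid, not the repo's fix:)

    from sklearn.metrics import accuracy_score

    def safe_accuracy(labels, preds):
        # lists accumulated across batches can drift in length if one side
        # drops or duplicates a partial final batch
        if len(labels) != len(preds):
            print(f'length mismatch: {len(labels)} labels vs {len(preds)} preds')
            n = min(len(labels), len(preds))
            labels, preds = labels[:n], preds[:n]  # crude alignment for debugging
        return accuracy_score(labels, preds)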

Umair6977 commented 2 years ago

Yes, I got it. I have run it with all params, but no test accuracy or other metrics are shown graphically, whereas earlier a JSON file and some graphs were produced for metrics such as val_accuracy, val_f1, val_pre, val_recall. Please, can you guide me on how to plot and generate a JSON file with the test accuracy and all the previously defined metrics for the test set? Thanks!

CherBass commented 2 years ago

Test metrics should already be saved at the end (after the code finishes running) in test_results.json.

Since these are just single values, there is no point in plotting them graphically. In validation we plot the metrics across epochs to see how the training is going.
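(As an illustration, writing such a results file is typically a one-liner with json.dump; a sketch with hypothetical metric values, not the repo's exact code:)

    import json

    test_results = {'test_accuracy': 0.91, 'test_f1': 0.90,
                    'test_precision': 0.89, 'test_recall': 0.92}  # hypothetical values

    with open('test_results.json', 'w') as f:
        json.dump(test_results, f, indent=4)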

Umair6977 commented 2 years ago

Only the parameters.json file has been saved, NOT test_results.json.

CherBass commented 2 years ago

Most likely that means the _test() function hasn't finished running successfully, so you will have to debug why it didn't run.

Umair6977 commented 2 years ago

Maybe something is missing, but I think the _test() function has also run successfully, as you can see here: [screenshot: WeChat Screenshot_20220605202557]

Umair6977 commented 2 years ago

As an expert, if you think there is anything missing, please give some valuable suggestions to make it accurate. This is the whole overview of the running process: [screenshot: WeChat Screenshot_20220605205926]

Bundle of thanks!