cleverhans-lab / machine-unlearning

MIT License
153 stars · 34 forks

Error on the fifth shard #3

Open clemley opened 3 years ago

clemley commented 3 years ago

On the first request of the fifth shard there appears to be an index error, and the run crashes at that point. All other shards run properly; only the fifth one fails. Is there a way to fix this?

huxi2 commented 3 years ago

I found that the number of records in `purchase2_train.npy` generated by running `init.sh` was 249215, which differed from the number in the datasetfile. I fixed this by modifying this line in `prepare_data.py`: `X_train, X_test, y_train, y_test = train_test_split(data, label, test_size=0.1)`
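The size mismatch can be reproduced with a few lines of arithmetic. The sketch below mimics scikit-learn's `train_test_split` rounding (test count rounded up, train count taking the remainder) and assumes the Purchase dataset here has 311,519 records in total (249,215 + 62,304, the two sizes quoted later in this thread); that total is an inference, not something stated in the repo. Note that a `test_size=0.1` split yields a train size of 280367, the same bound reported in the `IndexError` later in this thread.

```python
import math

def split_sizes(n_total, test_size):
    """Mimic scikit-learn's train_test_split size arithmetic:
    the test count is rounded up, the train count is the remainder."""
    n_test = math.ceil(n_total * test_size)
    n_train = n_total - n_test
    return n_train, n_test

# Assumed total: 249215 + 62304 = 311519 records.
n_total = 311519
print(split_sizes(n_total, 0.2))  # -> (249215, 62304)
print(split_sizes(n_total, 0.1))  # -> (280367, 31152)
```

Whichever `test_size` is used, the datasetfile's hard-coded sizes must agree with the split that `prepare_data.py` actually performs.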

Hope that helps

swagStar123-code commented 1 year ago

Following this suggestion, changing 0.2 to 0.1 still produces the problem described above.

KatieHYT commented 1 year ago

Same here. Even if I change from 0.2 to 0.1, I still get the index error: `IndexError: index 280367 is out of bounds for axis 0 with size 280367`.

any suggestion till now?

nimeshagrawal commented 11 months ago

Any solution found regarding this issue?

nimeshagrawal commented 11 months ago

The problem is in `datasets/purchase/datasetfile`: the train and test sample sizes are hard-coded there. `prepare_data.py` splits according to `test_size = 0.2`, but the datasetfile has sample sizes corresponding to `test_size = 0.1`. Hence, change the train and test sample sizes in the datasetfile (replace with `nb_train = 249215` and `nb_test = 62304`).
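A quick consistency check along these lines can confirm whether the datasetfile still matches the generated arrays. This is a hedged sketch: it assumes the datasetfile is JSON and uses the `nb_train`/`nb_test` key names quoted in this comment; the helper name and JSON-string interface are mine, for illustration only.

```python
import json

def check_datasetfile(meta_json, n_train_generated, n_test_generated):
    """Compare the sizes recorded in the datasetfile (assumed to be JSON
    with "nb_train"/"nb_test" keys) against the actual lengths of the
    generated train/test arrays. Returns {} when everything matches,
    otherwise a dict of (recorded, actual) pairs per mismatched field."""
    meta = json.loads(meta_json)
    mismatches = {}
    if meta["nb_train"] != n_train_generated:
        mismatches["nb_train"] = (meta["nb_train"], n_train_generated)
    if meta["nb_test"] != n_test_generated:
        mismatches["nb_test"] = (meta["nb_test"], n_test_generated)
    return mismatches

# Sizes from a test_size=0.2 split, with a matching datasetfile:
print(check_datasetfile('{"nb_train": 249215, "nb_test": 62304}',
                        249215, 62304))  # -> {}
# Stale sizes from a test_size=0.1 split flag both fields:
print(check_datasetfile('{"nb_train": 280367, "nb_test": 31152}',
                        249215, 62304))
```

In practice you would read the JSON from `datasets/purchase/datasetfile` and take the lengths from the generated `.npy` arrays (e.g. via `numpy.load`).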

scottshufe commented 10 months ago

Thanks for your solution. It solved my problem perfectly.

> The problem is there in the datasets/purchase/datasetfile. They have hard coded train and test sample size. The prepare_data.py splits according to test_size = 0.2, but "datasetfile" has sample sizes according to test_size = 0.1. Hence, change train & test sample size in "datasetfile". (Replace with nb_train = 249215 and nb_test = 62304)

GM-git-dotcom commented 8 months ago

> The problem is there in the datasets/purchase/datasetfile. They have hard coded train and test sample size. The prepare_data.py splits according to test_size = 0.2, but "datasetfile" has sample sizes according to test_size = 0.1. Hence, change train & test sample size in "datasetfile". (Replace with nb_train = 249215 and nb_test = 62304)

This. And remember to run `python prepare_data.py` after making this change.