Open mo-shahab opened 4 months ago
After reducing the number of requests and running the program with nothing else open (to leave as much memory as possible for the execution), it partly works, but then this error is encountered:
(venv) shahab in C:\dev\vershachi-unlearning\examples on main ● ~1 λ python .\example_sisa_train.py
shard: 1/4, requests: 1/5
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_0.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_0.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_1.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_1.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_2.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_2.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_3.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_3.time
shard: 1/4, requests: 2/5
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_0.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_0.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_1.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_1.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_2.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_2.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_3.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_3.time
shard: 1/4, requests: 3/5
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_0.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_0.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_1.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_1.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_2.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_2.time
Previous checkpoint removed: containers/default/cache/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_3.pt
Previous time file removed: containers/default/times/1f99f38a56d49f1a56feff3ea0e749f6c0030e106111e83b5f79a1988da0ac4c_3.time
shard: 1/4, requests: 4/5
Previous checkpoint removed: containers/default/cache/04f1b558d065f8157e14e96376563fffe263bdc76ba1474992fa7cc4b6119785_0.pt
Previous time file removed: containers/default/times/04f1b558d065f8157e14e96376563fffe263bdc76ba1474992fa7cc4b6119785_0.time
Previous checkpoint removed: containers/default/cache/04f1b558d065f8157e14e96376563fffe263bdc76ba1474992fa7cc4b6119785_1.pt
Previous time file removed: containers/default/times/04f1b558d065f8157e14e96376563fffe263bdc76ba1474992fa7cc4b6119785_1.time
Previous checkpoint removed: containers/default/cache/04f1b558d065f8157e14e96376563fffe263bdc76ba1474992fa7cc4b6119785_2.pt
Previous time file removed: containers/default/times/04f1b558d065f8157e14e96376563fffe263bdc76ba1474992fa7cc4b6119785_2.time
Previous checkpoint removed: containers/default/cache/04f1b558d065f8157e14e96376563fffe263bdc76ba1474992fa7cc4b6119785_3.pt
Previous time file removed: containers/default/times/04f1b558d065f8157e14e96376563fffe263bdc76ba1474992fa7cc4b6119785_3.time
shard: 1/4, requests: 5/5
Previous checkpoint removed: containers/default/cache/c0e8001bc8d0bd2801677e8ff143e5a32acfacc0c2da10e0b126bc5ed0c74f17_0.pt
Previous time file removed: containers/default/times/c0e8001bc8d0bd2801677e8ff143e5a32acfacc0c2da10e0b126bc5ed0c74f17_0.time
Previous checkpoint removed: containers/default/cache/c0e8001bc8d0bd2801677e8ff143e5a32acfacc0c2da10e0b126bc5ed0c74f17_1.pt
Previous time file removed: containers/default/times/c0e8001bc8d0bd2801677e8ff143e5a32acfacc0c2da10e0b126bc5ed0c74f17_1.time
Previous checkpoint removed: containers/default/cache/c0e8001bc8d0bd2801677e8ff143e5a32acfacc0c2da10e0b126bc5ed0c74f17_2.pt
Previous time file removed: containers/default/times/c0e8001bc8d0bd2801677e8ff143e5a32acfacc0c2da10e0b126bc5ed0c74f17_2.time
Previous checkpoint removed: containers/default/cache/c0e8001bc8d0bd2801677e8ff143e5a32acfacc0c2da10e0b126bc5ed0c74f17_3.pt
Previous time file removed: containers/default/times/c0e8001bc8d0bd2801677e8ff143e5a32acfacc0c2da10e0b126bc5ed0c74f17_3.time
shard: 2/4, requests: 1/5
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.time
shard: 2/4, requests: 2/5
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.time
shard: 2/4, requests: 3/5
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.time
shard: 2/4, requests: 4/5
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.time
shard: 2/4, requests: 5/5
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_0.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_1.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_2.time
Previous checkpoint removed: containers/default/cache/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.pt
Previous time file removed: containers/default/times/fea2ec41a7ba4d3afa03c015d949b82be37fbf7dc0ab3c36a9533dd3a31847ee_3.time
shard: 3/4, requests: 1/5
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.time
shard: 3/4, requests: 2/5
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.time
shard: 3/4, requests: 3/5
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.time
shard: 3/4, requests: 4/5
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.time
shard: 3/4, requests: 5/5
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_0.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_1.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_2.time
Previous checkpoint removed: containers/default/cache/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.pt
Previous time file removed: containers/default/times/45812b7774ae4b83f9ba74fdef06750da70440e09c3be34412c21b0eabeedc5f_3.time
shard: 4/4, requests: 1/5
Traceback (most recent call last):
File "C:\dev\vershachi-unlearning\examples\example_sisa_train.py", line 27, in <module>
trainer._train()
File "C:\dev\vershachi-unlearning\vershachi\sisa\sisa.py", line 159, in _train
for images, labels in fetchShardBatch(
File "C:\dev\vershachi-unlearning\vershachi\sisa\sharded.py", line 107, in fetchShardBatch
yield dataloader_module.load(indices)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\vershachi-unlearning\datasets\dataloader.py", line 21, in load
return X_train[indices], y_train[indices]
~~~~~~~^^^^^^^^^
IndexError: index 249215 is out of bounds for axis 0 with size 249215
The traceback says the index is out of range: index 249215 was requested from an array of size 249215, whose last valid index is 249214. The run does generate the `.pt` checkpoint files, which are the checkpoints created by PyTorch, so the model is being trained up to that point. I also reduced the number of epochs, which may cause a problem later in development because the model could end up insufficiently trained.
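To narrow down the `IndexError`, a small check on the indices before they reach `X_train[indices]` might help. This is only a sketch in plain Python: `check_indices` is a hypothetical helper, not something that exists in the repository, and it assumes the shard indices can be iterated as plain integers.

```python
def check_indices(indices, size):
    """Split `indices` into those valid for an array of length `size`
    and those that would raise IndexError.

    Hypothetical helper for debugging; not part of vershachi.
    Negative indices are treated as invalid here on purpose, since
    shard indices are expected to be non-negative positions.
    """
    valid = [i for i in indices if 0 <= i < size]
    invalid = [i for i in indices if not (0 <= i < size)]
    return valid, invalid


# An array of size 249215 has valid positions 0..249214, so the index
# from the traceback (249215) is exactly one past the end.
size = 249215
valid, invalid = check_indices([0, 249214, 249215], size)
print(valid)    # [0, 249214]
print(invalid)  # [249215]
```

If the invalid list only ever contains the value equal to the array size, that would point to an off-by-one somewhere in how the shard indices are generated (for example 1-based indices or an inclusive upper bound), but that is a guess until the split file is inspected.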
Should probably reduce the number of `requests` and the number of `epochs`; currently `requests=16` and `epochs=5`. Also, should work on batch loading of the data into the model in the training phase.
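For the batch loading idea, one possible shape is a generator that hands out the shard's indices in fixed-size chunks, so only one chunk of data has to be loaded at a time. This is a rough sketch under my own assumptions: `iter_batches` and `batch_size` are hypothetical names, and this is not how `fetchShardBatch` is currently written.

```python
def iter_batches(indices, batch_size):
    """Yield consecutive slices of `indices`, each at most `batch_size` long.

    Hypothetical helper; the last batch may be shorter than the others.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]


# Example: 5 indices in batches of 2 gives two full batches and one short one.
for batch in iter_batches([10, 11, 12, 13, 14], batch_size=2):
    print(batch)
```

Each yielded batch could then be passed to the dataloader's `load(indices)` call seen in the traceback, which would keep the amount of data held in memory per step bounded by `batch_size`.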