Closed belerico closed 5 months ago
Thanks. I'm guessing the buffer is perhaps not the same as the membuffer? It seems like it's an action history buffer or something like that. However, resuming behaves strangely when this is enabled: it ends up using the memory buffer from the previous run (although these membuffer files never seem to be updated, so I'm not really sure what they're for). The first resumption works fine, but upon the second cancellation, the membuffer files from the previous run (the ones that are being used) look like they get deleted, so when trying to resume a second time there are no membuffer files and it results in an error. It's also a bit confusing, as what you seem to end up with is like
In the case where you have buffer checkpoint False, each run just creates its own new membuffer and then does the pretraining to fill up the action buffer (or whatever it is), and everything works fine.
Hi @Disastorm, I will try to provide some clarity:
1. The first run creates the memmap files (the `.memmap` files in the `memmap_buffer` folder).
2. The second run (resumed from the first run checkpoint) does not create any memmap file, because the buffer instantiated from the checkpoint references the files in the first run `memmap_buffer` folder.
3. The same holds for every later resumed run (with `buffer.checkpoint=True`): the buffer stored in the checkpoint (from the second run) recursively references the memmap in the log directory of the first run.

The buffer of the second run is saved in the checkpoint, but it references the files of the first run. No other memmap files are created when resuming from a checkpoint (if `buffer.checkpoint=True`). This means that the `memmap_buffer` folder of the first run must NOT be deleted.
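To make the referencing behavior above concrete, here is a minimal sketch (not sheeprl's actual implementation; the `MemmapBuffer` class and its methods are hypothetical) of a buffer whose checkpoint stores only the *path* of its backing `.memmap` file, so a resumed run reopens the first run's file instead of creating a new one:

```python
import os
import tempfile

import numpy as np

# Hypothetical sketch: a buffer backed by a .memmap file. Checkpointing
# stores only the file path, so a buffer restored from a checkpoint keeps
# using the FIRST run's file instead of creating a new one.
class MemmapBuffer:
    def __init__(self, path, size):
        self.path = path
        # "w+" creates the file on the first run; "r+" reuses it on resume
        mode = "r+" if os.path.exists(path) else "w+"
        self.data = np.memmap(path, dtype=np.float32, mode=mode, shape=(size,))

    def state_dict(self):
        # Only the reference is checkpointed, not the array contents
        return {"path": self.path, "size": self.data.shape[0]}

    @classmethod
    def from_state_dict(cls, state):
        return cls(state["path"], state["size"])


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "buffer.memmap")

run1 = MemmapBuffer(path, size=4)          # first run: creates the file
run1.data[0] = 1.0
run1.data.flush()                          # persist the mapped pages
ckpt = run1.state_dict()                   # checkpoint holds only the path

run2 = MemmapBuffer.from_state_dict(ckpt)  # resumed run: no new file
print(run2.data[0])                        # 1.0, read from the first run's file
```

This is also why deleting the first run's `memmap_buffer` folder breaks every later resume: the checkpoint only knows where the data lives, not the data itself.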
I also tried to test it and it works (on Linux), I can restart an experiment multiple times:
This is the result of the test I carried out. In particular, what I have done is the following:

```
python sheeprl.py exp=dreamer_v3_100k_ms_pacman checkpoint.every=100
```

for the first run, and

```
python sheeprl.py exp=dreamer_v3_100k_ms_pacman checkpoint.every=100 checkpoint.resume_from=/path/to/first/run/checkpoint.ckpt
```

for the resumed runs (you must include the `.ckpt` file in the path).

I understand that the logic of `resume_from_checkpoint` is convoluted and can be a bit confusing, sorry for that. I hope it is now clearer. We will try to make some changes to simplify this process.
Thanks, yeah, that's basically the same behavior I saw; it's just that something was somehow triggering the auto-delete of the memmap files from the first run. I don't really know what was triggering it, but it seemed potentially related to ctrl-c'ing the run. Not sure if I messed up a setting somewhere, or if it's related to running on Windows, etc. Anyway, I'm just using `buffer.checkpoint=False` for now, so you can close this issue, but I just wanted to mention that there may be some trigger somewhere that auto-deletes the memmap files from the first run when you do ctrl-c on one of the later runs.
Hey @Disastorm, we have indeed a problem with memmapped arrays on Windows: can you try out this branch pls?
Oh I see, thanks. I'll try that out whenever I train a new model; for the one I'm currently running I already have `buffer.checkpoint=False`, so I can't try it on this one. Or are you saying that even when `checkpoint=False` the memmap is not working properly, and I should use that branch regardless?
Btw, separate question: is there a way to set exploration in DreamerV3? Would I adjust `ent_coef`, or do I need to use one of those other things like the Plan2Explore configs (I don't know what Plan2Explore is)?
> Oh I see thanks. I'll try that out for whenever I try to train a new model, for the one I'm currently running I already have buffer.checkpoint = False so I can't try it on this one. or are you saying that actually even when checkpoint = False, the memmap is not working properly and I should use that branch regardless?
Nope, the memmap is working properly; the problem arises when you checkpoint the buffer and try to resume multiple times. In that particular case the memmap buffer on Windows gets deleted. If you can try that new branch so we are sure it fixes your problem, then we can close the issue.
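One plausible shape of this bug (a hedged sketch, not sheeprl's actual code; the `MemmapBuffer` class and `owner` flag are hypothetical) is a buffer that deletes its backing file on cleanup. A resumed run that reuses the first run's file then deletes a file it does not own when it exits, e.g. on ctrl-c. Guarding the deletion with an "owner" flag, set only by the run that created the file, is one way to fix it:

```python
import os
import tempfile

import numpy as np

# Hypothetical sketch of the pitfall: a buffer that removes its backing
# file on cleanup. A resumed run reusing the first run's file would delete
# a file it does not own, unless the deletion is guarded by an owner flag.
class MemmapBuffer:
    def __init__(self, path, size):
        self.path = path
        self.owner = not os.path.exists(path)  # True only for the creator
        mode = "w+" if self.owner else "r+"
        self.data = np.memmap(path, dtype=np.float32, mode=mode, shape=(size,))

    def close(self):
        # Release the mapping first (Windows refuses to unlink a mapped file)
        del self.data
        if self.owner:
            os.remove(self.path)


tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "buffer.memmap")

run1 = MemmapBuffer(path, size=4)   # creator: owner=True
run2 = MemmapBuffer(path, size=4)   # resumed run: owner=False
run2.close()                        # does NOT delete the file
print(os.path.exists(path))         # True: the first run's file survives
```

Without the `owner` guard, `run2.close()` would unlink the first run's file, which matches the symptom of the memmap files disappearing after cancelling a resumed run.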
> btw separate question, is there a way to set exploration in dreamerv3? would i adjust ent_coef, or do i need to use one of those other things like the Plan2Explore configs ( I don't know what Plan2Explore is ).
I will open a new issue with the question, to keep things in order.
Confirmed, this is fixed.
Hi @Disastorm, I've copied your new question here, so that the other issue can stay closed:
@belerico Hey just wondering how this buffer checkpointing works? I have
And so when resuming it doesn't do the pretraining buffer steps anymore. However, I noticed the buffer files don't ever get updated; the last modified date is just when the first training started. Is this a problem? The files I'm referring to are the `.memmap` files. I see now it doesn't keep creating them for each run when checkpoint = True, so I assumed it would be using the ones from the previous run, but their update date isn't changing at all. Is the buffer inside the checkpoint file itself? The file size of the checkpoint still looks pretty similar to when running with checkpoint: False, I think.
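A stale "last modified" date does not necessarily mean the file isn't being written. Writes through a memory map go straight to the mapped pages, and the OS is not required to refresh the file's mtime on every such write, so the timestamp can lag far behind the actual contents. A small sketch (the path is a throwaway temp file, not a sheeprl artifact) showing that data written through a memmap really lands in the file:

```python
import os
import tempfile

import numpy as np

# Writes through a memory map update the file's contents (at the latest
# on flush) even if the OS does not refresh the file's mtime each time,
# so "last modified" is a poor indicator of whether the buffer is in use.
path = os.path.join(tempfile.mkdtemp(), "buffer.memmap")

buf = np.memmap(path, dtype=np.float32, mode="w+", shape=(4,))
buf[0] = 42.0
buf.flush()          # make sure the mapped pages reach the file
del buf              # release the mapping

# Re-open read-only: the written value is really on disk
check = np.memmap(path, dtype=np.float32, mode="r", shape=(4,))
print(check[0])      # 42.0
```

So checking the file's contents (or its size growing) is a more reliable sign of activity than the modification date, especially on Windows.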