bojobo closed this issue 1 month ago
Creating the images takes quite a bit of compute. The current simulated dataset was created on a compute cluster with more than 100 CPUs. I calculated that it would have taken about 3 months to simulate these images on my own PC. Regarding the randomness, creating the simulated images already involves quite a bit of randomness. However, for reproducibility a static dataset is preferable. That said, having a more diverse simulated dataset would definitely be interesting! Maybe we could even cluster the images into classes (types of objects being observed) and see which classes perform better or worse than others. We could then generate more training samples for those classes.
Oh damn, that's quite some time. Does it take that long even with multiprocessing? I'll see if there are some optimisations possible.
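To make the multiprocessing question concrete, here is a minimal sketch of fanning simulation runs out across CPU cores with Python's `multiprocessing.Pool`. Note that `simulate_image` is a hypothetical, trivially cheap stand-in for the real (far more expensive) per-image simulation in the repo, not its actual API:

```python
from multiprocessing import Pool
import random


def simulate_image(seed: int) -> tuple[int, float]:
    # Hypothetical stand-in for one expensive simulation run.
    # Seeding per image keeps each result reproducible.
    rng = random.Random(seed)
    return seed, rng.random()


def simulate_batch(seeds, workers: int = 4):
    # Each worker picks up the next seed as soon as it finishes
    # its previous image, keeping all cores busy.
    with Pool(processes=workers) as pool:
        return pool.map(simulate_image, seeds)


if __name__ == "__main__":
    results = simulate_batch(range(8))
    print(len(results))  # 8 simulated images
```

Since the real workload is CPU-bound, processes (not threads) are the right tool here; the speedup is roughly linear in the number of cores until I/O or memory becomes the bottleneck.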
I'm not quite sure about the classes. Super-resolution should work regardless of the type of object observed, i.e. all classes should be represented with the same number of images. Otherwise the model could tend to "hallucinate" objects which aren't there.
Oh, and I'll probably also move this issue to the xmm-epicpn-simulator repo (same as SamSweere/xmm-epicpn-simulator#27).
Use https://github.com/SamSweere/xmm-epicpn-simulator for creating the simulated dataset
Use xmm-epicpn-simulator. If possible, we could add some randomness to the creation process. We could even check if we can generate images on the fly.
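On-the-fly generation and reproducibility are not mutually exclusive: a seeded random stream gives fresh samples every epoch while remaining exactly repeatable across runs. A rough sketch of the idea, where `generate_sample` is a hypothetical placeholder for a real simulator call:

```python
import random


def generate_sample(rng: random.Random) -> list[float]:
    # Hypothetical placeholder: a tiny "image" of 4 random pixel values,
    # standing in for a full simulated observation.
    return [rng.random() for _ in range(4)]


def sample_stream(seed: int):
    # One seeded RNG drives the whole stream, so the same seed
    # always reproduces the same sequence of generated samples.
    rng = random.Random(seed)
    while True:
        yield generate_sample(rng)


stream = sample_stream(seed=42)
batch = [next(stream) for _ in range(2)]
```

In a training loop this generator could feed batches directly; logging the seed is then enough to regenerate the exact "dataset" a given model was trained on, without storing the images.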