CosmoStat / autometacal

Metacalibration and shape measurement by automatic differentiation
MIT License

Refactor TF Dataset #46

Open andrevitorelli opened 2 years ago

andrevitorelli commented 2 years ago

After talking with @EiffL and @b-remy, we decided to converge on a more usable format for our tfds.

We will have a simple galaxy generator dataset and one based on parametric COSMOS galaxies drawn as CFIS-like stamp images.

This issue is to discuss some design choices and track/discuss this development.

Issues/todo:

What do you think about these open questions, @EiffL ?

EiffL commented 2 years ago

Yep :-) but I'm a bit confused, I thought we had already done all of that. I remember discussions like here: https://github.com/CosmoStat/autometacal/issues/39

EiffL commented 2 years ago

Otherwise, I want to say no to knots for now, and we will need to keep fairly faint galaxies in order to apply the selection cuts.

EiffL commented 2 years ago

And the only form of augmentation we can make is generating the noise on the fly instead of hardcoding it in the dataset.

andrevitorelli commented 2 years ago

> Yep :-) but I'm a bit confused, I thought we had already done all of that. I remember discussions like here: #39

We did, but the current state of the module is still not satisfactory.

> And the only form of augmentation we can make is generating the noise on the fly instead of hardcoding it in the dataset.

This is kind of a problem. The COSMOS dataset is small, isn't it? We need more than 80k galaxy models.

EiffL commented 2 years ago

No, it's not really a problem: you can apply random rotations to the galaxies and generate independent noise realisations. At least for training a NN, that's more than enough.
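To make the idea concrete, here is a minimal sketch of that kind of on-the-fly augmentation. It is not the autometacal implementation: the function name `augment_on_the_fly`, the `noise_sigma` parameter, and the restriction to rotations by multiples of 90 degrees (chosen to avoid interpolation) are all assumptions made for illustration, and it uses plain NumPy rather than the project's TF pipeline.

```python
import numpy as np


def augment_on_the_fly(stamps, noise_sigma, seed=None):
    """Yield (noisy_stamp, index) pairs indefinitely from noiseless stamps.

    Hypothetical helper: `stamps` is an array of shape (n, size, size)
    holding noiseless, square galaxy images.
    """
    rng = np.random.default_rng(seed)
    stamps = np.asarray(stamps, dtype=np.float32)
    while True:
        # Pick one galaxy model at random from the small catalogue.
        idx = int(rng.integers(len(stamps)))
        # Rotate by a random multiple of 90 degrees (assumption: this
        # avoids interpolation; arbitrary angles would need resampling).
        k = int(rng.integers(4))
        rotated = np.rot90(stamps[idx], k=k)
        # Draw an independent Gaussian noise realisation for this example.
        noise = rng.normal(0.0, noise_sigma, size=rotated.shape)
        yield (rotated + noise).astype(np.float32), idx
```

Because each draw combines a random model, a random rotation, and fresh noise, the stream never repeats an example exactly, even though the underlying set of models is fixed. A generator like this could in principle be wrapped with `tf.data.Dataset.from_generator`, though the same operations could also be written directly as a `map` step on the dataset.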

andrevitorelli commented 2 years ago

OK, that's the kind of data augmentation I was talking about.

EiffL commented 2 years ago

Yep, but this can most likely be done on the fly.