ldeo-glaciology / LEAP-Cryo-planning

A repo for planning and tracking progress on the LEAP-Cryo project: Learning ice-sheet flow with physics-based and machine learning models.

Dealing with NaN values in data #23

Open Templar129 opened 1 year ago

Templar129 commented 1 year ago

Our topography data inevitably has a lot of NaN values. Some of them are clustered, so it would be hard for us to fill them with regular interpolation methods. Andrew talked about a package that could be very useful, GStatSim. It builds on a few other packages, such as scipy and sklearn, to fill NaN values, especially in geological data. Here is the relevant page from its documentation: https://gatorglaciology.github.io/gstatsimbook/3_Simple_kriging_and_ordinary_kriging.html
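For reference, here is a minimal NumPy-only sketch of the ordinary kriging idea behind this kind of gap filling. It is not GStatSim's API or implementation: the function names, the exponential covariance model, and the `sill`, `corr_range`, and `k` values are all assumptions chosen for illustration, and a real workflow would fit a variogram to the data first.

```python
import numpy as np

def exp_covariance(h, sill=1.0, corr_range=10.0):
    """Exponential covariance model. sill and corr_range are assumed, not fitted."""
    return sill * np.exp(-3.0 * h / corr_range)

def fill_nan_ordinary_kriging(grid, k=16, sill=1.0, corr_range=10.0):
    """Fill NaN cells of a 2D array by ordinary kriging on the k nearest valid cells."""
    filled = grid.astype(float).copy()
    valid = ~np.isnan(filled)
    known_xy = np.argwhere(valid).astype(float)  # (n, 2) row/col coordinates
    known_z = filled[valid]                      # same row-major order as known_xy
    if known_z.size == 0:
        raise ValueError("grid contains no valid values to interpolate from")
    k = min(k, known_z.size)

    for r, c in np.argwhere(~valid):
        # Distances from the missing cell to every known cell.
        d = np.hypot(known_xy[:, 0] - r, known_xy[:, 1] - c)
        idx = np.argpartition(d, k - 1)[:k]      # indices of the k nearest known cells
        pts, z = known_xy[idx], known_z[idx]

        # Ordinary kriging system: covariances between neighbours, plus a
        # Lagrange row/column that forces the weights to sum to 1.
        pair_d = np.hypot(pts[:, None, 0] - pts[None, :, 0],
                          pts[:, None, 1] - pts[None, :, 1])
        A = np.ones((k + 1, k + 1))
        A[:k, :k] = exp_covariance(pair_d, sill, corr_range)
        A[k, k] = 0.0
        b = np.ones(k + 1)
        b[:k] = exp_covariance(d[idx], sill, corr_range)

        weights = np.linalg.solve(A, b)[:k]
        filled[r, c] = weights @ z
    return filled

# Synthetic smooth surface with one clustered gap (not real topography data).
yy, xx = np.mgrid[0:32, 0:32]
surface = np.sin(xx / 6.0) + np.cos(yy / 8.0)
gappy = surface.copy()
gappy[10:16, 12:20] = np.nan
filled = fill_nan_ordinary_kriging(gappy, k=16, corr_range=15.0)
print("NaN cells remaining:", int(np.isnan(filled).sum()))
```

Because the weights are constrained to sum to 1, a constant field is reproduced exactly, which is one reason ordinary kriging behaves sensibly inside large clustered gaps where the local mean is unknown.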

I chose one of our images and tried the package, and I think it will be pretty useful. Right now it works on a 512 x 512 pixel image, but we can apply it to the larger original image. After we fill all the NaN values, we can use the data in the VAE. The VAE works much better without the NaNs, and filling them also reduces the spurious patterns it learns from the NaNs, which is not what we want.
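One possible way to scale from a single 512 x 512 image to the larger original is to process it tile by tile. The sketch below assumes that approach; `fill_by_tiles` and `mean_fill` are hypothetical names, and `mean_fill` is only a stand-in for whatever kriging routine is actually used on each tile.

```python
import numpy as np

def fill_by_tiles(image, fill_tile, tile=512):
    """Apply fill_tile to every tile x tile block that has some (but not all) NaN cells."""
    out = image.astype(float).copy()
    rows, cols = out.shape
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            block = out[r0:r0 + tile, c0:c0 + tile]
            nan_mask = np.isnan(block)
            # Skip tiles that are already complete or have no data to fill from.
            if nan_mask.any() and not nan_mask.all():
                out[r0:r0 + tile, c0:c0 + tile] = fill_tile(block)
    return out

def mean_fill(block):
    """Stand-in filler: replace NaN with the tile mean. Swap in a kriging routine here."""
    return np.where(np.isnan(block), np.nanmean(block), block)

# Small synthetic example with 3 x 3 tiles instead of 512 x 512.
image = np.arange(36, dtype=float).reshape(6, 6)
image[1:3, 1:3] = np.nan
filled = fill_by_tiles(image, mean_fill, tile=3)
print(f"NaN fraction before: {np.isnan(image).mean():.3f}, "
      f"after: {np.isnan(filled).mean():.3f}")
```

Filling tiles independently can leave visible seams at tile edges, so overlapping tiles or a neighbour search that crosses tile boundaries may be needed in practice. Tiles that are entirely NaN are left untouched here.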

Here is an example of how this package helps fill the NaN values.

[Image: Screenshot 2023-09-15 at 2 24 21 PM]
[Image: Screenshot 2023-09-15 at 2 24 12 PM]