BrandonSmithJ / MDN

Mixture Density Network for water constituent estimation
GNU General Public License v3.0

About the data in the paper #3

Closed TGISer closed 4 years ago

TGISer commented 4 years ago

Very nice job!!! I'm excited to see this kind of work. Thank you for your generous code. Such a complete work is very helpful for a beginner like me.

For a beginner like me, organizing the data is very difficult. Could you share the original data from your paper, such as "path/to/my/tile.nc", along with the training and test data?

Thanks!

TGISer commented 4 years ago

If it is inconvenient for you to share all the data, could you share a subset of it for testing the code? Thanks!

BrandonSmithJ commented 4 years ago

Unfortunately, we don't have permission from data owners to distribute the in situ set publicly at this point - though we do have plans in the future to compile such a database for release.

As far as the netcdf tiles, you can download them from a few different places. For MSI/OLCI, the easiest would probably be https://scihub.copernicus.eu/dhus/ , which lets you select imagery by time / location, as well as by a few other parameters. Hope this helps!

TGISer commented 4 years ago

Looking forward to your new work on the dataset!

Can you give me an example of a sample (it doesn't need to be real data) showing how to organize and store the training and test data? I only need one example; I will try to build a test dataset myself.

Thanks!

BrandonSmithJ commented 4 years ago

Sure, an example would be something like:

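import numpy as np
from MDN import image_estimates, get_sensor_bands

sensor = "<S2A, S2B, or OLCI>"
# One random 3x3 "image" per band of the chosen sensor
data = [np.random.rand(3,3) for band in get_sensor_bands(sensor)]

# Each band is passed as a separate positional argument
estimates = image_estimates(*data, sensor=sensor)
chlor_a = estimates[0]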

As for organization / storing, I have data stored in a structure like the following:

Data_Folder/Test_Data/Dataset_Location/Sensor/Rrs.csv
Data_Folder/Test_Data/Dataset_Location/chl.csv

"Dataset_Location" keeps datasets from different contributors separated, and "Sensor" holds the spectral data for each instrument (MSI, OLCI, ...).
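For anyone who wants to try this layout without real measurements, here is a minimal sketch that writes dummy files into the folder structure described above. The file contents are assumptions, not something confirmed in this thread: it guesses that each CSV has no header, one row per sample, one column per band in Rrs.csv, and that row i of chl.csv belongs to row i of Rrs.csv. The helper names (`write_csv`, `make_dummy_dataset`), the band count, and the value ranges are hypothetical and chosen only for illustration.

```python
import csv
import random
import tempfile
from pathlib import Path


def write_csv(path, rows):
    """Write rows to a CSV file, creating parent folders as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as f:
        csv.writer(f).writerows(rows)


def make_dummy_dataset(root, location, sensor, n_samples=5, n_bands=4, seed=0):
    """Create <root>/Test_Data/<location>/<sensor>/Rrs.csv and
    <root>/Test_Data/<location>/chl.csv filled with random values.

    Assumed (not confirmed) format: no header row, one row per sample.
    """
    rng = random.Random(seed)
    base = Path(root) / "Test_Data" / location

    # Fake reflectance spectra: n_samples rows, n_bands columns
    rrs = [[rng.uniform(0.0, 0.05) for _ in range(n_bands)]
           for _ in range(n_samples)]
    # Fake chlorophyll values: one value per sample
    chl = [[rng.uniform(0.1, 100.0)] for _ in range(n_samples)]

    rrs_path = base / sensor / "Rrs.csv"
    chl_path = base / "chl.csv"
    write_csv(rrs_path, rrs)
    write_csv(chl_path, chl)
    return rrs_path, chl_path


# Build the dummy dataset in a temporary directory
root = Path(tempfile.mkdtemp()) / "Data_Folder"
rrs_path, chl_path = make_dummy_dataset(root, "Dataset_Location", "MSI")
print(rrs_path)
print(chl_path)
```

Keeping chl.csv one level above the sensor folder mirrors the structure above: the in situ values are shared by every sensor, while the spectra differ per instrument.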

TGISer commented 4 years ago

Got it! Thank you for your help!

megaocean commented 4 years ago

Hi! This code is so great, thank you! I am new to machine learning and this is really helping me write my own model for my research :) I apologize in advance if this is a stupid question~ bear with me, I'm still trying to understand the model. I tried running the SOLID model myself for TSS retrieval but I keep getting the output "No trained model exists:" and I'm not sure what I'm missing or doing wrong? I already have a .csv file with my Rrs data.

Thanks!

BrandonSmithJ commented 4 years ago

Glad you've found it useful!

It's a good question: the code is still being modified a fair bit as we expand its coverage, and SOLID was dependent upon a prior version of the MDN code. So that's my fault for not ensuring the backwards compatibility; apologies for the issue. I've now pushed a new code version to the repo, containing a number of bug fixes as well as a few feature additions. Running SOLID should now work as expected.

Note that the general API has a minor change if using this updated code (@TGISer ): Rrs data coming from the "get_tile_data" function should no longer be unpacked when passing it into the "image_estimates" function. This updated functionality is shown in the README.
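To make the difference in calling convention concrete, here is a small sketch that does not use the MDN package at all. `estimate_unpacked` and `estimate_packed` are hypothetical stand-ins that only count the bands they receive; they are not the real "image_estimates" and say nothing about its actual signature beyond what is described above. The README remains the reference for the real call.

```python
def estimate_unpacked(*bands, sensor):
    """Hypothetical stand-in for the older style: every band arrives
    as its own positional argument."""
    return len(bands)


def estimate_packed(rrs, sensor):
    """Hypothetical stand-in for the updated style: all bands arrive
    together as a single argument."""
    return len(rrs)


# Four fake 3x3 bands, built from plain nested lists
bands = [[[0.01] * 3 for _ in range(3)] for _ in range(4)]

# Older style: the list is unpacked with * at the call site
old_count = estimate_unpacked(*bands, sensor="MSI")

# Updated style: the list is passed as-is, without unpacking
new_count = estimate_packed(bands, sensor="MSI")

print(old_count, new_count)
```

Both calls see the same four bands; the only change is whether the caller writes the `*`.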

I'll be closing this issue as everything should be resolved here, but please do open new issues if you run into any trouble.