oceanmodeling / searvey

Sea state observational data retrieval
https://searvey.readthedocs.io/en/stable/
GNU General Public License v3.0

NDBC data #137

Open saeed-moghimi-noaa opened 4 months ago

saeed-moghimi-noaa commented 4 months ago

https://www.ndbc.noaa.gov/ (NDBC page header screenshot)

See an example here: https://github.com/saeed-moghimi-noaa/prep_obs_ca

# coops_ndbc_obs_collector.py, around line 250 in the linked repo

@retry(stop_max_attempt_number=5, wait_fixed=3000)
def get_ndbc(start, end, bbox, sos_name='waves', datum='MSL', verbose=True):
    """
    Read NDBC data for the given time window and bounding box.

    Depending on sos_name, the returned columns are:

    sos_name = 'waves'
    all_col = (['station_id', 'sensor_id', 'latitude (degree)', 'longitude (degree)',
           'date_time', 'sea_surface_wave_significant_height (m)',
           'sea_surface_wave_peak_period (s)', 'sea_surface_wave_mean_period (s)',
           'sea_surface_swell_wave_significant_height (m)',
           'sea_surface_swell_wave_period (s)',
           'sea_surface_wind_wave_significant_height (m)',
           'sea_surface_wind_wave_period (s)', 'sea_water_temperature (c)',
           'sea_surface_wave_to_direction (degree)',
           'sea_surface_swell_wave_to_direction (degree)',
           'sea_surface_wind_wave_to_direction (degree)',
           'number_of_frequencies (count)', 'center_frequencies (Hz)',
           'bandwidths (Hz)', 'spectral_energy (m**2/Hz)',
           'mean_wave_direction (degree)', 'principal_wave_direction (degree)',
           'polar_coordinate_r1 (1)', 'polar_coordinate_r2 (1)',
           'calculation_method', 'sampling_rate (Hz)', 'name'])

    sos_name = 'winds'

    all_col = (['station_id', 'sensor_id', 'latitude (degree)', 'longitude (degree)',
       'date_time', 'depth (m)', 'wind_from_direction (degree)',
       'wind_speed (m/s)', 'wind_speed_of_gust (m/s)',
       'upward_air_velocity (m/s)', 'name'])
    """
    # ... (implementation continues in the linked file)
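For orientation, below is a minimal sketch of how such a collector is often written against the NDBC SOS endpoint using pyoos (the library used in many IOOS example notebooks). It is not taken from the linked file; the NdbcSos class, the set_bbox/start_time/end_time/variables attributes, and the raw(responseFormat='text/csv') call are assumptions based on those notebooks and should be verified against the pyoos docs:

from io import StringIO

import pandas as pd
from pyoos.collectors.ndbc.ndbc_sos import NdbcSos  # assumed pyoos collector class

def fetch_ndbc_sos(start, end, bbox, sos_name='waves'):
    """Hypothetical helper: fetch NDBC SOS observations as a pandas DataFrame."""
    collector = NdbcSos()
    collector.set_bbox(bbox)          # (lon_min, lat_min, lon_max, lat_max)
    collector.start_time = start
    collector.end_time = end
    collector.variables = [sos_name]  # e.g. 'waves' or 'winds'
    response = collector.raw(responseFormat='text/csv')
    if isinstance(response, bytes):   # raw() may return bytes or str
        response = response.decode('utf-8')
    return pd.read_csv(StringIO(response), parse_dates=['date_time'])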
saeed-moghimi-noaa commented 4 months ago

See also: https://pypi.org/project/ndbc-api/ https://github.com/cdjellen/ndbc-api
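For a quick sense of what ndbc-api looks like in use, here is a sketch based on its README; the station id, mode, and date range below are placeholders, and the exact get_data signature should be double-checked against the package documentation:

from ndbc_api import NdbcApi

api = NdbcApi()

# Standard meteorological data for one station over a placeholder date range.
df = api.get_data(
    station_id='41001',        # placeholder buoy id
    mode='stdmet',             # standard meteorological measurements
    start_time='2020-01-01',
    end_time='2020-02-01',
    as_df=True,                # return a pandas DataFrame instead of a dict
)
print(df.head())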

saeed-moghimi-noaa commented 4 months ago

@AliS-Noaa @aliabdolali

What is your preferred web API for downloading NDBC data?

Thanks

AliS-Noaa commented 4 months ago

Hello Saeed,

Here are two ways I usually get the NDBC data:

https://github.com/NOAA-EMC/WW3-tools/blob/develop/ww3tools/downloadobs/wfetchbuoy.py

This is also a good tool:

https://pypi.org/project/NDBC/

Cheers, Ali Salimi-Tarazouj (Lynker at NOAA/NWS/NCEP/EMC)

SorooshMani-NOAA commented 4 months ago

From a correspondence with one of our colleagues:

Near-real-time observations from NWS fixed buoys, NWS C-MAN stations, and many ROOA-operated buoys and coastal stations are available on the ndbc.noaa.gov web site. I don't know if NDBC has an API yet, but one can obtain their observations via HTTPS or DODS/OPeNDAP; see https://www.ndbc.noaa.gov/docs/ndbc_web_data_guide.pdf.

However, I found that someone has written 'ndbc-api' to "parse whitespace-delimited oceanographic and atmospheric data distributed as text files for available time ranges, on a station-by-station basis" (https://pypi.org/project/ndbc-api/). I also found ndbc.py at https://pypi.org/project/NDBC/. I imagine there are many others out there.
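As a concrete illustration of the plain-HTTPS route from the web data guide, the realtime standard meteorological files can be read directly with pandas. The realtime2 URL pattern and the two header rows (names, then units) are documented by NDBC; the station id below is only a placeholder:

import pandas as pd

station = "41001"  # placeholder buoy id
url = f"https://www.ndbc.noaa.gov/data/realtime2/{station}.txt"

# Row 0 holds the column names (prefixed with '#'), row 1 holds the units,
# and "MM" marks missing values in these whitespace-delimited files.
df = pd.read_csv(
    url,
    sep=r"\s+",
    header=0,
    skiprows=[1],      # drop the units row
    na_values=["MM"],
)
print(df.head())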

SorooshMani-NOAA commented 3 months ago

During our meeting on June 5th we discussed the following items/tasks related to NDBC data:

Todo:

SorooshMani-NOAA commented 3 months ago

Hi @pmav99, today we discussed @abdu558's NDBC implementation. I suggested that he implement everything based on the "new" API (as in #125), but use ndbc.py as the file name instead of _ndbc_api.py. What do you think?

We also discussed whether or not to combine all data into a single dataframe, whether to keep the missing-value columns, etc. I suggested discussing those in the group meeting next week.

@abdu558, can you please summarize your questions here as well so that we can discuss them more constructively next week?

SorooshMani-NOAA commented 3 months ago

@abdu558, I forgot to ask: what is the state of the conda package for ndbc-api? You said they are open to creating the conda package themselves, right?

abdu558 commented 3 months ago

Yeah, they did create it and said it would take a few days or so for it to show up. (Screenshot of their GitHub reply attached.)

SorooshMani-NOAA commented 3 months ago

Response from NDBC:

[...] We do not have an API though we are hopeful to develop one in the future.

Our FAQs might be a good place to start with your quality control questions: https://www.ndbc.noaa.gov/faq/

tomsail commented 3 months ago

Thanks Soroosh. More on QC here: https://www.ndbc.noaa.gov/faq/qc.shtml. There is an exhaustive guide on the QC methodology (2009 version), and all the QC flags are summarized in Appendix E.

abdu558 commented 2 months ago

Replying to @SorooshMani-NOAA's questions above:

You answered most of them, but the one I'm not 100% sure about is what should happen when multiple stations are requested:

1) an extra station_id column is added and the data from the different stations are combined into a single dataframe, or

2) the function outputs a dictionary that maps each station id to a dataframe of that station's data.

This is the one I'm not 100% sure about.

pmav99 commented 2 months ago

@abdu558 different providers return different data. For example, when you try to retrieve data from a bunch of IOC stations, you will end up with dataframes that have different numbers of columns and different column names. E.g.:

https://www.ioc-sealevelmonitoring.org/bgraph.php?code=aden&output=tab&period=0.5&endtime=2018-06-07 https://www.ioc-sealevelmonitoring.org/bgraph.php?code=abed&output=tab&period=0.5&endtime=2018-06-07

Merging these will result in a bunch of columns full of NaNs. This is problematic because NaNs are floats and consume quite a bit of RAM; if you are retrieving hundreds or thousands of stations for many years, this can quickly become a problem.

Furthermore, since you can't really know which column will have data for each station, you will end up calling .dropna() for every station id you want to process. That can also be problematic, because the provider might return NaNs anyway and you might want to distinguish those from the NaNs introduced by the merge.

Alternatively, you can just avoid merging in the first place. If somebody wants to merge the dictionary, it is trivial to do so, e.g.:

import pandas as pd

# Per-station dataframes with different columns and different time indices
data = {
    "st1": pd.DataFrame(index=["2020", "2021"], data={"var1": [111, 222]}),
    "st2": pd.DataFrame(index=["2021", "2022", "2023"], data={"var2": [1, 2, 3], "var3": [0, float("nan"), float("nan")]}),
}

# Concatenating the dict uses its keys as an extra outer index level ("station_id");
# reset_index(level=0) then turns that level into a regular column.
merged = pd.concat(data, names=["station_id", "time"]).reset_index(level=0)

print(data)
print(merged)
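In the merged frame, var1 is NaN for every st2 row and var2/var3 are NaN for every st1 row, which is exactly the column blow-up described above; keeping the dictionary defers that cost until someone actually needs the combined table.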