saeed-moghimi-noaa opened this issue 4 months ago
@AliS-Noaa @aliabdolali
What is your preferred web API for downloading NDBC data?
Thanks
Hello Saeed,
Here are two ways I usually get the NDBC data:
https://github.com/NOAA-EMC/WW3-tools/blob/develop/ww3tools/downloadobs/wfetchbuoy.py
also this is a good tool as well:
https://pypi.org/project/NDBC/
Cheers,
Ali Salimi-Tarazouj
From a correspondence with one of our colleagues:
Near-real-time observations from NWS fixed buoys, NWS C-MAN stations, and many ROOA-operated buoys and coastal stations are available on the ndbc.noaa.gov web site. I don't know if NDBC has an API yet, but one can obtain their obs via HTTPS or DODS/OPeNDAP: https://www.ndbc.noaa.gov/docs/ndbc_web_data_guide.pdf

However, I found that someone has written `ndbc-api` to "parse whitespace-delimited oceanographic and atmospheric data distributed as text files for available time ranges, on a station-by-station basis" (https://pypi.org/project/ndbc-api/). I also found `ndbc.py` at https://pypi.org/project/NDBC/. I imagine there are many others out there.
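For reference, those whitespace-delimited station text files can be parsed with plain pandas. A minimal sketch, using a synthetic snippet in the style of the realtime files described in the web data guide linked above (the exact columns here are made up for illustration; `MM` is NDBC's missing-value marker, and no actual download is performed):

```python
import io

import pandas as pd

# Synthetic snippet in the style of NDBC realtime text files:
# two comment header rows (column names, then units), "MM" marks missing data.
sample = """\
#YY  MM DD hh mm WDIR WSPD  WVHT
#yr  mo dy hr mn degT m/s     m
2024 05 08 10 00  210  5.4   1.2
2024 05 08 09 50   MM  4.9   1.1
"""

buf = io.StringIO(sample)
names = buf.readline().lstrip("#").split()  # first header row: column names
buf.readline()                              # second header row: units (skipped)
df = pd.read_csv(buf, sep=r"\s+", names=names, na_values="MM")
print(df)
```

For a real station one would read the corresponding text file from ndbc.noaa.gov instead of the `StringIO` buffer; the parsing stays the same.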
During our meeting on June 5th we discussed the following items/tasks related to NDBC data:

- Decide between the `ndbc-api` package, an alternative package, or writing the code from scratch.
- `ndbc-api` is MIT-licensed, should we end up using it.

Todo:
Hi @pmav99, today we discussed @abdu558's NDBC implementation. I suggested that he implement everything based on the "new" API (as in #125), but use `ndbc.py` as the file name instead of `_ndbc_api.py`. What do you think?
We also discussed whether to combine all data into a single dataframe, whether to keep the missing-value columns, etc. I suggested discussing those in next week's group meeting.
@abdu558, can you please summarize your questions here as well so that we can discuss them more constructively next week?
@abdu558, I forgot to ask: what is the state of the conda package for `ndbc-api`? You said they are open to creating the conda package themselves, right?
Yes, they did create it and said it would take roughly a few days for it to show up.
Response from NDBC:
[...] We do not have an API though we are hopeful to develop one in the future.
Our FAQs might be a good place to start with your quality control questions: https://www.ndbc.noaa.gov/faq/
Thanks Soroosh. More on QC here: https://www.ndbc.noaa.gov/faq/qc.shtml. There is an exhaustive guide on the QC methodology (2009 version), with all the QC flags summarized in Appendix E.
You answered most of them, but the one I'm not 100% sure about is what to do when there are multiple stations:

1) add an extra `station_id` column and combine the data of the different stations into a single dataframe, or
2) output a dictionary that maps each station id to a dataframe of that station's data.
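For concreteness, the two shapes could be sketched like this (the station ids, variable names, and values are made up for illustration):

```python
import pandas as pd

# Hypothetical per-station data
df1 = pd.DataFrame({"wvht": [1.2, 1.4]}, index=pd.to_datetime(["2024-01-01", "2024-01-02"]))
df2 = pd.DataFrame({"wvht": [0.8]}, index=pd.to_datetime(["2024-01-01"]))

# Option 1: one dataframe with an extra station_id column
combined = pd.concat({"41001": df1, "41002": df2}, names=["station_id", "time"]).reset_index(level=0)

# Option 2: a dict mapping station id -> dataframe
per_station = {"41001": df1, "41002": df2}
```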
@abdu558, different providers return different data. For example, when you try to retrieve data from a bunch of IOC stations, you will end up with dataframes with different numbers of columns and different column names. E.g.:

https://www.ioc-sealevelmonitoring.org/bgraph.php?code=aden&output=tab&period=0.5&endtime=2018-06-07
https://www.ioc-sealevelmonitoring.org/bgraph.php?code=abed&output=tab&period=0.5&endtime=2018-06-07
Merging these will result in a bunch of columns full of NaNs. This is problematic because NaNs are floats and consume quite a bit of RAM; if you are retrieving hundreds or thousands of stations for many years, this quickly adds up.

Furthermore, since you can't really know which column will have data for each station, you will end up calling `.dropna()` for every station id you want to process. That can also be problematic, because the provider might return NaNs anyhow and you might want to differentiate between those.
Alternatively, you can just avoid merging in the first place. If somebody wants to merge the dictionary it is trivial to do so. E.g.:
```python
import pandas as pd

data = {
    "st1": pd.DataFrame(index=["2020", "2021"], data={"var1": [111, 222]}),
    "st2": pd.DataFrame(index=["2021", "2022", "2023"], data={"var2": [1, 2, 3], "var3": [0, float("nan"), float("nan")]}),
}
merged = pd.concat(data, names=["station_id", "time"]).reset_index(level=0)
print(data)
print(merged)
```
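Going the other direction is just as trivial. A sketch that rebuilds the per-station dict from a merged frame (reusing the same kind of toy data; note the `dropna(axis=1, how="all")` needed to shed the all-NaN columns that the merge introduced, which is exactly the issue described above):

```python
import pandas as pd

# Toy data: two stations with disjoint variables
data = {
    "st1": pd.DataFrame(index=["2020", "2021"], data={"var1": [111, 222]}),
    "st2": pd.DataFrame(index=["2021", "2022", "2023"], data={"var2": [1, 2, 3]}),
}
merged = pd.concat(data, names=["station_id", "time"]).reset_index(level=0)

# Split back into a per-station dict, dropping the all-NaN columns
# that the merge introduced for each station
split = {
    sid: group.drop(columns="station_id").dropna(axis=1, how="all")
    for sid, group in merged.groupby("station_id")
}
```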
https://www.ndbc.noaa.gov/
See an example here: https://github.com/saeed-moghimi-noaa/prep_obs_ca