alexander-petkov / wfas

A placeholder for the WFAS project.

RTMA archive retrieval #10

Open alexander-petkov opened 5 years ago

alexander-petkov commented 5 years ago

Configure retrieval of a 14-day RTMA archive. The data should be synced once per day, with local files kept and not overwritten.

alexander-petkov commented 5 years ago

I think the current command in crontab downloads the entire archive, but due to the nd option, it overwrites files, only saving locally the last day of data:

wget -nd --recursive --directory-prefix=somedir  -N  --no-parent -R*_nobiasc* -Artma2p5.*.2dvaranl_ndfd.grb2_wexp ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/rtma/prod

alexander-petkov commented 5 years ago

I tried lftp as an alternative to wget:

sudo lftp -c "
open ftp://ftp.ncep.noaa.gov
lcd somelocaldir
mirror --continue --delete --verbose --only-missing --no-empty-dirs -Irtma2p5.*/rtma2p5.*.2dvaranl_ndfd.grb2_wexp /pub/data/nccf/com/rtma/prod/ .
bye "

Would it also remove locally created files (cached data from the ImageMosaic plugin)? This needs to be tested.

wmjolly commented 5 years ago

I have another script that doesn't remove the data that I can share as well. I've never used lftp. Does it work well?

wmjolly commented 5 years ago

I think the current command in crontab downloads the entire archive, but due to the nd option, it overwrites files, only saving locally the last day of data:

wget -nd --recursive --directory-prefix=somedir  -N  --no-parent -R*_nobiasc* -Artma2p5.*.2dvaranl_ndfd.grb2_wexp ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/rtma/prod

I was really only using the last 24 hours of data for my other process, so I didn't need to keep the old data. We could consider mirroring the current RTMA archive instead. We'd have to deal with different data directories, but it would be clean and manageable. However, I extracted the files twice a day, so it built up the archive anyway; it just wasn't robust to download errors.

wmjolly commented 5 years ago

I tried lftp as an alternative to wget:

sudo lftp -c "
open ftp://ftp.ncep.noaa.gov
lcd somelocaldir
mirror --continue --delete --verbose --only-missing --no-empty-dirs -Irtma2p5.*/rtma2p5.*.2dvaranl_ndfd.grb2_wexp /pub/data/nccf/com/rtma/prod/ .
bye "

Would it also remove locally created files (cached data from the ImageMosaic plugin)? This needs to be tested.

Yes, I think the process would need to update the RTMA data and clean the cached ImageMosaic files but the ImageMosaic files could be cleaned or updated less frequently if it is time consuming. Maybe every 3 hours? If it's fast, we should sync the RTMA archive with the ImageMosaic as often as time will allow.

alexander-petkov commented 5 years ago

I have another script that doesn't remove the data that I can share as well. I've never used lftp. Does it work well?

Me neither--it seems to do what I want in a not very messy way. Removes directories and files that are not on the remote server anymore.

alexander-petkov commented 5 years ago

Yes, I think the process would need to update the RTMA data and clean the cached ImageMosaic files but the ImageMosaic files could be cleaned or updated less frequently if it is time consuming. Maybe every 3 hours? If it's fast, we should sync the RTMA archive with the ImageMosaic as often as time will allow.

Ok, I will need to do some testing.

alexander-petkov commented 5 years ago

When mirroring the RTMA archive, LFTP is removing the cache files generated by the ImageMosaic plugin. This kills the mosaic coverage...

This wget request works pretty well:

wget --cut-dirs 6 -xnH -c --recursive --directory-prefix=somedir \
         -N --no-parent -A 'rtma2p5.*.2dvaranl_ndfd.grb2_wexp' \
         'ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/rtma/prod/rtma2p5.*'

rtma2p5.* directories older than 14 days can be cleaned up with find:

find somedir -name 'rtma2p5.*' -type d -ctime +14 -prune -exec rm -rf {} \;
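
An alternative that does not depend on filesystem timestamps is to compare the date embedded in each directory name against a cutoff. This is only a sketch: prune_rtma is a hypothetical helper, it assumes GNU date, and it assumes the local directories keep the rtma2p5.YYYYMMDD names used on the FTP server. Anything that does not match that pattern, such as ImageMosaic cache files, is left alone.

```shell
# Sketch only: prune_rtma and its arguments are illustrative, not the project's script.
# Usage: prune_rtma ROOT [KEEP_DAYS]
prune_rtma() (
    root=$1
    keep_days=${2:-14}
    # Oldest YYYYMMDD stamp to keep (GNU date syntax is assumed here).
    cutoff=$(date -u -d "$keep_days days ago" +%Y%m%d) || exit 1
    for dir in "$root"/rtma2p5.*; do
        [ -d "$dir" ] || continue          # also covers a glob that matched nothing
        stamp=${dir##*/rtma2p5.}
        # Only touch directories whose suffix is an 8-digit date.
        case $stamp in
            [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
            *) continue ;;
        esac
        if [ "$stamp" -lt "$cutoff" ]; then
            rm -rf -- "$dir"
        fi
    done
)
```

Calling prune_rtma somedir 14 would remove only the dated directories older than the cutoff.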

alexander-petkov commented 5 years ago

I have an initial script for syncing the local RTMA archive with the FTP location and updating the ImageMosaic coverage store.

It is currently written as a bash script, which isn't exactly elegant or easily human-readable, but does not require additional software.

It is scheduled via cron to run 10 minutes past every 3rd hour.

alexander-petkov commented 5 years ago

This is a subtask of #7

alexander-petkov commented 5 years ago

These are the variables I have currently configured as coverages from the RTMA *2dvaranl_ndfd.grb2_wexp hourly files:

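[image: image] https://user-images.githubusercontent.com/39599557/61806435-e52bf700-adf4-11e9-90e7-61da2ed1a8ab.png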

  1. Cloud ceiling --I don't think that this is the variable representing Cloud Cover. Edit: I think I need Total Cloud Cover from the varanl dataset.
  2. Specific Humidity--I don't think that is the same as Relative Humidity. Might have to derive that.
  3. The precipitation-associated variables are missing in these files. Edit: Total Precipitation [kg/m^2] is in the rtma2p5.YYYYMMDDCC.pcp.184.grb2 files.
  4. RTMA Mosaic is now updated from ftp hourly, 10 minutes past the hour.

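Example DescribeCoverage request for rtma:Temperature http://192.168.59.41:8081/geoserver/ows?service=WCS&version=1.0.0&request=DescribeCoverage&coverage=rtma:Temperature (on Internet 2)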

wmjolly commented 5 years ago

So, we'll have to calculate RH for the RTMA from air temperature and dewpoint temperature. We do not need the Cloud_ceiling raster, we need Cloud Cover or Sky Cover. Specific Humidity and Dewpoint are related, so we'll just use Dewpoint.

alexander-petkov commented 5 years ago

So, we'll have to calculate RH for the RTMA from air temperature and dewpoint temperature. We do not need the Cloud_ceiling raster, we need Cloud Cover or Sky Cover. Specific Humidity and Dewpoint are related, so we'll just use Dewpoint.

Thanks Matt, I edited my previous comment.

alexander-petkov commented 5 years ago

I updated the list of RTMA coverages:

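[image: image] https://user-images.githubusercontent.com/39599557/61902191-b722e180-aede-11e9-8ba9-e11b1db09cf0.png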

There are two RTMA mosaics configured now, since PCP is in its own dataset... I also changed the archive update script to include PCP.

At this time I am only missing RH. I need to find out what formula is used for calculating it.

wmjolly commented 5 years ago

RH is calculated from Temperature and Dewpoint_temperature as follows:

RH = (VP(Dewpoint_temperature) / VP(Temperature)) * 100.0

Where:

#include <math.h>

float CalcVP(int iTemp){
    /* iTemp: Temperature in Fahrenheit */
    /* Purpose: Calculate the saturation vapor pressure */
    /* Convert Temperature to Kelvin */
    float kTemp = (iTemp - 32) / 1.8 + 273.15;
    /* Return the saturation vapor pressure at the given temperature */
    return exp((float)1.81 + (kTemp * 17.27 - 4717.31) / (kTemp - 35.86));
}

alexander-petkov commented 5 years ago

RH is calculated from Temperature and Dewpoint_temperature as follows: RH = (VP(Dewpoint_temperature) / VP(Temperature)) * 100.0

Thanks!! So, using this formula, relative humidity can be output with cdo as follows:

./cdo -expr,'2r=(exp(1.81+(2d*17.27- 4717.31) / (2d - 35.86))/exp(1.81+(2t*17.27- 4717.31) / (2t - 35.86)))*100' rtma2p5.t07z.2dvaranl_ndfd.grb2_wexp rhm.grb2

where 2d is Dewpoint Temperature (K), and 2t is Temperature (K)
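
A quick way to spot-check the expression is to evaluate the same formula for a single pair of values. The sketch below does that with awk. calc_rh is a hypothetical helper name, not part of the update script, and both inputs are assumed to be in Kelvin, as in the cdo call.

```shell
# Sketch only: calc_rh is an illustrative helper, not part of the RTMA update script.
# Usage: calc_rh TEMP_K DEWPOINT_K   (both in Kelvin); prints RH in percent.
calc_rh() {
    awk -v t="$1" -v td="$2" '
        # Saturation vapor pressure with the same constants as CalcVP in this thread.
        function vp(k) { return exp(1.81 + (k * 17.27 - 4717.31) / (k - 35.86)) }
        BEGIN { printf "%.2f\n", vp(td) / vp(t) * 100 }'
}
```

For example, calc_rh 293.15 283.15 should print a value of roughly 52.5, and an equal temperature and dewpoint give 100.00.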

wmjolly commented 5 years ago

Dang, that's cool.

MJ

alexander-petkov commented 5 years ago

I have added the calculated RH as an additional band to the RTMA files with cdo merge. However, Geoserver cannot read the geographic extent for the newly computed band.

Interestingly enough, Geoserver reads the newly computed single-band RH file just fine...

So, at this point, I will keep the files separate, and will configure RH as a separate ImageMosaic coverage store.

I will keep thinking of a more elegant solution.

alexander-petkov commented 5 years ago

It takes a very long time to configure ImageMosaic using RTMA data. The server becomes "busy" for a long time; it may take 4-5 hours or more.

I should try configuring ImageMosaic on a physical machine, and time it. Maybe throwing more RAM at manager1 will help.

wmjolly commented 5 years ago

Bummer. I was hoping it would be faster using ImageMosaic.

wmjolly commented 5 years ago

I know, and respect, that you want to keep everything in its native file format, but I am just not convinced that GRIB2 files are all that efficient. We might need to explore using a different format like GeoTIFF, along with some of the flags for building pyramids and compression that can make it more efficient.

alexander-petkov commented 5 years ago

For some reason I haven't succeeded in configuring the computed RHM files as an ImageMosaic in Geoserver.

Individually, they display fine.

I will keep trying.

alexander-petkov commented 5 years ago

I know, and respect, that you want to keep everything in their native file formats but I am just not convinced that GRIB2 files are all that efficient. We might need to explore using a different format like GeoTIFF and using some of the flags for building pyramids and compression that can make it more efficient.

Good point--I can experiment with that on the derived RTMA Relative Humidity layer.

alexander-petkov commented 5 years ago

Align the pcp archive with varanl during download. The pcp archive is temporally different (it goes further back in time), and the extra archive files are not useful for our needs anyway, as I understand.

Update: to make things even worse, some files from the FTP archive might be missing, such as the 20Z file for 20190911:

image
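
Gaps like this can be listed per day before the mosaic is updated. The following is a rough sketch, assuming each local day directory holds hourly files named like the rtma2p5.t07z.2dvaranl_ndfd.grb2_wexp example above. missing_hours is a hypothetical helper name.

```shell
# Sketch only: missing_hours is an illustrative helper, not part of the update script.
# Usage: missing_hours DAY_DIR   prints each hour (00-23) that has no analysis file.
missing_hours() (
    day_dir=$1
    for h in 00 01 02 03 04 05 06 07 08 09 10 11 \
             12 13 14 15 16 17 18 19 20 21 22 23; do
        [ -e "$day_dir/rtma2p5.t${h}z.2dvaranl_ndfd.grb2_wexp" ] || printf '%s\n' "$h"
    done
)
```

Running missing_hours somedir/rtma2p5.20190911 on a directory that lacks only the 20Z file would print 20.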

alexander-petkov commented 5 years ago

I changed the RTMA update script to make Geotiffs for all mosaics.

RH maps are now upside down for some reason. All the rest are OK.

[image: Screenshot from 2019-10-16 07-05-23]

alexander-petkov commented 5 years ago

Another quirk--during initial configuration, the Imagemosaic plugin automatically configures all mosaics as backed by NetCDF files. This is because I kept the original file names, to make archive syncing easier. As a result, all data was shown as 0.

I tricked it by replacing the SuggestedSPI parameter like so:

SuggestedSPI=it.geosolutions.imageioimpl.plugins.tiff.TIFFImageReaderSpi

I then deleted and reconfigured all mosaics, without deleting the PostGIS indexing tables.

alexander-petkov commented 5 years ago

Use gdal_translate with '-co PROFILE=GeoTIFF'; otherwise the GRIB metadata tags created by the default GDALGeoTIFF profile cause all mosaics to revert to the NetCDFImageReaderSPI.
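
As a minimal sketch of that conversion step, the call could be wrapped like this. to_geotiff is a hypothetical wrapper name and the file names in the usage note are placeholders. Only the -of GTiff and -co PROFILE=GeoTIFF options are assumed, and the actual update script may pass more.

```shell
# Sketch only: to_geotiff is an illustrative wrapper, not the project's function.
# Usage: to_geotiff SRC DST
to_geotiff() {
    # PROFILE=GeoTIFF restricts the output to standard GeoTIFF tags,
    # leaving out the GDAL-specific metadata tags.
    gdal_translate -of GTiff -co PROFILE=GeoTIFF "$1" "$2"
}
```

A call such as to_geotiff input.grb2 output.tif converts a single file.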

alexander-petkov commented 3 years ago

Configure RTMA data retrieval for Alaska

alexander-petkov commented 1 year ago

Major overhaul for RTMA data retrieval and archiving, mostly due to provider changes.

  1. All common functions are unified in an "API script" (in this commit).
  2. Update scripts for CONUS and AK are rewritten to call functions in the script above (in this commit).

Now there is an issue with displaying RH via WMS. For some reason Geoserver's WMS server is picking a NetCDF reader for the RH mosaics, although the files are GeoTIFFs...