Open alexander-petkov opened 5 years ago
I think the current command in crontab downloads the entire archive, but because of the -nd option, files from different days overwrite each other, so only the last day of data is kept locally:
wget -nd --recursive --directory-prefix=somedir -N --no-parent -R*_nobiasc* -Artma2p5.*.2dvaranl_ndfd.grb2_wexp ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/rtma/prod
I tried lftp as an alternative to wget:
sudo lftp -f "
open ftp://ftp.ncep.noaa.gov
lcd somelocaldir
mirror --continue --delete --verbose --only-missing --no-empty-dirs -Irtma2p5.*/rtma2p5.*.2dvaranl_ndfd.grb2_wexp /pub/data/nccf/com/rtma/prod/ .
bye "
Would it also remove locally created files (cached data from the ImageMosaic plugin)? This needs to be tested.
I was really only using the last 24 hours of data for my other process, so I didn't need to keep the old data. We could consider mirroring the current RTMA archive instead. We'd have to deal with different data directories, but it would be clean and manageable. However, I extracted the files twice a day, so it built up the archive anyway; it just wasn't robust to download errors.
Yes, I think the process would need to update the RTMA data and clean the cached ImageMosaic files but the ImageMosaic files could be cleaned or updated less frequently if it is time consuming. Maybe every 3 hours? If it's fast, we should sync the RTMA archive with the ImageMosaic as often as time will allow.
I have another script that doesn't remove the data that I can share as well. I've never used lftp. Does it work well?
Me neither--it seems to do what I want without much mess. It removes directories and files that are no longer on the remote server.
Ok, I will need to do some testing.
When mirroring the RTMA archive, LFTP is removing the cache files generated by the ImageMosaic plugin. This kills the mosaic coverage...
This wget request works pretty well:
wget --cut-dirs 6 -xnH -c --recursive --directory-prefix=somedir \
-N --no-parent -Artma2p5.*.2dvaranl_ndfd.grb2_wexp \
ftp://ftp.ncep.noaa.gov/pub/data/nccf/com/rtma/prod/rtma2p5.*
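To illustrate why this layout avoids the overwriting problem, here is a minimal sketch of how -x, -nH and --cut-dirs 6 map a remote path to a local one. The helper name local_path is hypothetical; it only mimics the path trimming and does not call wget.

```shell
# Hypothetical helper: mimic how wget -x -nH --cut-dirs=N --directory-prefix=DIR
# turns a remote path into a local one. Illustration only; wget is not invoked.
# $1 = remote path, $2 = number of leading directories to cut, $3 = directory prefix
local_path() {
  remote=${1#/}      # -nH drops the host name, so start from the path without its leading slash
  cut_dirs=$2        # leading directory components removed by --cut-dirs
  prefix=$3          # value of --directory-prefix
  kept=$(printf '%s\n' "$remote" | cut -d/ -f"$((cut_dirs + 1))"-)
  printf '%s/%s\n' "$prefix" "$kept"
}

# Usage:
# local_path /pub/data/nccf/com/rtma/prod/rtma2p5.20190911/rtma2p5.t20z.2dvaranl_ndfd.grb2_wexp 6 somedir
```

The six components cut are pub, data, nccf, com, rtma and prod, so each file lands under its own rtma2p5.YYYYMMDD directory and hourly files from different days no longer share a name.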
rtma2p5.* directories older than 14 days can be cleaned up with find:
find somedir -name 'rtma2p5.*' -type d -ctime +14 -exec rm -rf {} \;
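Note that -ctime looks at when the local directory's status last changed, not at the date in its name, so the two can drift apart. As a hedged alternative, the sketch below compares the YYYYMMDD suffix of each directory against a cutoff date. The function name prune_rtma_dirs is hypothetical, and the cutoff computation in the usage comment assumes GNU date.

```shell
# Hypothetical helper: remove rtma2p5.YYYYMMDD directories dated before a cutoff.
# $1 = archive directory, $2 = cutoff date as YYYYMMDD
prune_rtma_dirs() {
  base=$1
  cutoff=$2
  for dir in "$base"/rtma2p5.*; do
    [ -d "$dir" ] || continue          # skip plain files and an unexpanded glob
    stamp=${dir##*/rtma2p5.}           # keep only the part after "rtma2p5."
    case $stamp in
      [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9])
        # Only directories with an 8-digit date suffix are candidates for removal
        if [ "$stamp" -lt "$cutoff" ]; then
          rm -rf -- "$dir"
        fi
        ;;
    esac
  done
}

# Usage (assumes GNU date for the "14 days ago" arithmetic):
# prune_rtma_dirs somedir "$(date -u -d '14 days ago' +%Y%m%d)"
```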
I have an initial script for syncing the RTMA archive to the ftp location, and updating the ImageMosaic coverage store.
It is currently written as a bash script, which isn't exactly elegant or easy to read, but does not require additional software.
It is scheduled via cron to run 10 minutes past every 3rd hour.
This is a subtask to #7
These are the variables I have currently configured as coverages from the RTMA *2dvaranl_ndfd.grb2_wexp hourly files:
Example DescribeCoverage request for rtma:Temperature (on Internet 2)
So, we'll have to calculate RH for the RTMA from air temperature and dewpoint temperature. We do not need the Cloud_ceiling raster, we need Cloud Cover or Sky Cover. Specific Humidity and Dewpoint are related, so we'll just use Dewpoint.
On Wed, Jul 24, 2019 at 9:38 AM alexander-petkov notifications@github.com wrote:
These are the variables I have currently configured as coverages from the RTMA *2dvaranl_ndfd.grb2_wexp hourly files:
[image: image] https://user-images.githubusercontent.com/39599557/61806435-e52bf700-adf4-11e9-90e7-61da2ed1a8ab.png
- Cloud ceiling --I don't think that this is the variable representing Cloud Cover
- Specific Humidity--I don't think that is the same as Relative Humidity. Might have to derive that.
- The precipitation-associated variables are missing in these files.
- RTMA Mosaic is now updated from ftp hourly, 10 minutes past the hour.
Example DescribeCoverage request for rtma:Temperature http://192.168.59.41:8081/geoserver/ows?service=WCS&version=1.0.0&request=DescribeCoverage&coverage=rtma:Temperature (on Internet 2)
Thanks Matt, I edited my previous comment.
I updated the list of RTMA coverages:
There are two RTMA mosaics configured now, since PCP is in its own dataset... I also changed the archive update script to include PCP.
At this time I am only missing RH. I need to find out what formula is used for calculating it.
RH is calculated from Temperature and Dewpoint_temperature as follows:
RH = (VP(Dewpoint_temperature) / VP(Temperature)) * 100.0
Where:
float CalcVP(int iTemp)
{
    /* iTemp: Temperature in Fahrenheit */
    /* Purpose: Calculate the saturation vapor pressure */
    /* Convert Temperature to Kelvin */
    float kTemp = (iTemp - 32) / 1.8 + 273.15;
    /* Return the saturation vapor pressure at the given temperature */
    return exp((float)1.81 + (kTemp * 17.27 - 4717.31) / (kTemp - 35.86));
}
On Thu, Jul 25, 2019 at 1:20 PM alexander-petkov notifications@github.com wrote:
I updated the list of RTMA coverages:
[image: image] https://user-images.githubusercontent.com/39599557/61902191-b722e180-aede-11e9-8ba9-e11b1db09cf0.png
Thanks!! So, using this formula, Relative humidity can be output as follows with cdo
./cdo -expr,'2r=(exp(1.81+(2d*17.27- 4717.31) / (2d - 35.86))/exp(1.81+(2t*17.27- 4717.31) / (2t - 35.86)))*100' rtma2p5.t07z.2dvaranl_ndfd.grb2_wexp rhm.grb2
where 2d is Dewpoint Temperature (K), and 2t is Temperature (K)
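As a quick sanity check of the expression, the same arithmetic can be evaluated for a single pair of values with awk. This is only a sketch: the names rh_from_kelvin and vp are hypothetical, both inputs are assumed to be in Kelvin as in the cdo call, and the constants are copied from the CalcVP function quoted in this thread.

```shell
# Hypothetical helper: relative humidity (%) from temperature and dewpoint, both in Kelvin.
# $1 = temperature (K), $2 = dewpoint temperature (K)
rh_from_kelvin() {
  awk -v t="$1" -v td="$2" '
    # Saturation vapor pressure, same constants as CalcVP
    function vp(k) { return exp(1.81 + (k * 17.27 - 4717.31) / (k - 35.86)) }
    BEGIN { printf "%.2f\n", vp(td) / vp(t) * 100 }'
}

# Usage: rh_from_kelvin 293.15 283.15
```

When the dewpoint equals the temperature the two vapor pressures cancel and the result is 100, which is an easy way to confirm the argument order.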
Dang, that's cool.
MJ
I have added the calculated RH as an additional band to the RTMA files with cdo merge. However, Geoserver cannot read the geographic extent of the newly computed band.
Interestingly enough, Geoserver reads the newly computed 1-band RH file just fine...
So, at this point, I will keep the files separate, and will configure RH as a separate ImageMosaic coverage store.
I will keep thinking of a more elegant solution.
It takes a very long time to configure ImageMosaic using RTMA data. The server stays "busy" for a long time; it may take 4-5 hours or more.
I should try configuring ImageMosaic on a physical machine and time it. Maybe throwing more RAM at manager1 will help.
Bummer. I was hoping it would be faster using ImageMosaic.
I know, and respect, that you want to keep everything in their native file formats but I am just not convinced that GRIB2 files are all that efficient. We might need to explore using a different format like GeoTIFF and using some of the flags for building pyramids and compression that can make it more efficient.
For some reason I haven't succeeded in configuring the computed RHM files as an ImageMosaic in Geoserver.
Individually, they display fine.
I will keep trying.
Good point--I can experiment with that on the derived RTMA Relative Humidity layer.
Align the pcp archive with varanl during download. The pcp archive is temporally different (it goes further back in time), and the extra archive files are not useful for our needs anyway, as I understand.
Update: to make things even worse, some files might be missing from the FTP archive, for example the 20Z file for 20190911.
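One way to spot such gaps after a sync is to check a day's directory for all 24 hourly analysis files. A minimal sketch, assuming the rtma2p5.tHHz.2dvaranl_ndfd.grb2_wexp naming used in the wget commands; the function name missing_hours is hypothetical.

```shell
# Hypothetical helper: print the hours (00-23) that have no analysis file in a day directory.
# $1 = a local rtma2p5.YYYYMMDD directory
missing_hours() {
  day_dir=$1
  hour=0
  while [ "$hour" -le 23 ]; do
    hh=$(printf '%02d' "$hour")        # zero-pad to match the tHHz part of the file name
    [ -e "$day_dir/rtma2p5.t${hh}z.2dvaranl_ndfd.grb2_wexp" ] || printf '%s\n' "$hh"
    hour=$((hour + 1))
  done
}

# Usage: missing_hours somedir/rtma2p5.20190911
```

An empty result means all 24 files are present locally; it does not tell whether a gap comes from the provider or from a failed download.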
I changed the RTMA update script to make Geotiffs for all mosaics.
RH maps are now upside down for some reason. All the rest are OK.
Another quirk--during initial configuration, the ImageMosaic plugin automatically configures all mosaics as backed by NetCDF files. This is because I kept the original file names, to make archive syncing easier. As a result, all data was shown as 0.
I tricked it by replacing the SuggestedSPI parameter like so:
SuggestedSPI=it.geosolutions.imageioimpl.plugins.tiff.TIFFImageReaderSpi
Deleted, and reconfigured all mosaics again, without deleting the Postgis indexing tables.
Use gdal_translate with '-co PROFILE=GeoTIFF', otherwise the GRIB metadata tags created by the GDALGeoTIFF default profile cause all mosaics to revert to the NetCDFImageReaderSPI.
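A minimal sketch of that conversion step, assuming one GeoTIFF is written per GRIB2 file. The function name grib_to_geotiff and the DRY_RUN switch are hypothetical, and the COMPRESS and TILED creation options are added for illustration only; they are not settings taken from the update script.

```shell
# Hypothetical wrapper around gdal_translate.
# $1 = input GRIB2 file, $2 = output GeoTIFF
# Set DRY_RUN=1 to print the command instead of running it.
grib_to_geotiff() {
  # PROFILE=GeoTIFF writes only GeoTIFF tags, without the GDAL-specific metadata tags
  set -- gdal_translate -of GTiff \
    -co PROFILE=GeoTIFF \
    -co COMPRESS=DEFLATE -co TILED=YES \
    "$1" "$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Usage: grib_to_geotiff rtma2p5.t07z.2dvaranl_ndfd.grb2_wexp out.tif
```

Whether the output keeps the original file name, as described above, is left to the caller.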
Configure RTMA data retrieval for Alaska
Major overhaul for RTMA data retrieval and archiving, mostly due to provider changes.
Now there is an issue with displaying RH via WMS. For some reason Geoserver's WMS server is picking a NetCDF reader for RH mosaics, although files are Geotiffs....
Configure an RTMA 14-day archive retrieval. The data is to be synced once per day, and local files kept and not overwritten.