GenericMappingTools / remote-datasets

Documentation for remote datasets on the GMT server
https://www.generic-mapping-tools.org/remote-datasets/

Release of GEBCO 2022 #41

Closed Esteban82 closed 1 year ago

Esteban82 commented 2 years ago

It seems that in June a new version was released.

https://www.gebco.net/data_and_products/gridded_bathymetry_data/gebco_2022/

PaulWessel commented 2 years ago

I wonder if we should test whether the instructions for doing this are good enough that someone other than me (perhaps @Esteban82!) could also do it. I think @maxrjones did the last update, and I believe we updated the instructions and docs whenever anything was unclear. If you do not already have it, you would need to clone the gmtserver-admin repo first. Then we would need to update the recipe files for GEBCO to select the latest URL file, then rebuild the files, etc.

Esteban82 commented 2 years ago

Ok. Yes, it seems a good idea. I think I could try to do it on the weekend or next week.

PaulWessel commented 2 years ago

Sure. When you are ready for it just let me know and I can guide you through it. Basically, all the resolution files and tiles will be made automatically by two scripts. Then I think one of the scripts writes a new section for the gmt_data_server.txt file for the server. I believe we still did a manual copy to the server; we will look at that once you have successfully made all the files locally. I will probably need to give you an account on the server at UH.

Esteban82 commented 2 years ago

Ok. Do I need any permission to work in the gmtserver-admin repo?

PaulWessel commented 2 years ago

Not sure, will check in a bit.

Esteban82 commented 2 years ago

I don't need to download the gebco grids to my pc, right?

BTW. Is it a good idea to add the GEBCO Type Identifier (TID) Grid? It is 4 GB.

PaulWessel commented 2 years ago

Actually, the way it works now is that whoever is doing this is doing it on their own computer. But, we should be able to set it up so it can be done on an account on the server. Not tried that though.

If the TID grid is a grid with source numbers that say what cruise or dataset was used for each node, then that is the same as what Sandwell does (SID grid). So far I have assumed those grids are for experts only and have not included them in GMT. It also makes no sense to make downsampled versions of those. I think the best we could do is to add info on where to get these.

maxrjones commented 2 years ago

@Esteban82 you can open a branch and submit/merge PRs in the gmtserver-admin repo.

IIRC I updated earth_relief and earth_synbath but not earth_gebco. See https://github.com/GenericMappingTools/gmtserver-admin/blob/37aad71dbf559b6bb290ea79f3a68f7885ec7dd2/.github/workflows/dataset-check.yml#L45-L56 for the checklist that I used for earth_relief. We should create an issue template that's generic for other datasets. In the meantime ping with any questions about translating those steps for earth_gebco.

Yes, you will need permissions for the gmt server in order to actually place the files.

The recipe assumes that the data was manually downloaded and then placed on Paul's ftp site. I'm not sure why that requirement exists, but perhaps @PaulWessel knows. (Referring to https://github.com/GenericMappingTools/gmtserver-admin/blob/37aad71dbf559b6bb290ea79f3a68f7885ec7dd2/recipes/earth_gebco.recipe#L7-L8)

PaulWessel commented 2 years ago

IIRC it was because it was not possible to download via the command line using curl. Many data sites are like that: they require you to click menus, say OK, do this, do that, and then some download starts. Not suitable for scripting. Since I figured I had to do this many times to get things right, I placed it on my ftp server. But normally the recipes (see the other ones) just download to your directory. I use the gmtserver-admin/staging directory for doing all the local work; that is what it is for. So @Esteban82 could just download it locally to his staging directory and edit the recipe accordingly.

Esteban82 commented 2 years ago

> If TID grid is a grid with source numbers that says what cruise or dataset was used for each node then that is the same as what Sandwell does (SID grid). So far I have assumed those grids are for experts only and not included them in GMT. It also makes no sense to make downsampled version of those. I think the best we could do is to add info on where to get these.

The TID is the type of data: singlebeam, multibeam, seismic, predicted from gravity, etc. OK, I agree to NOT include it. As it is for experts, I don't think that we should add any info.

Esteban82 commented 2 years ago

BTW, Sandwell updated the gravity grids yesterday (https://topex.ucsd.edu/pub/global_grav_1min/README_V32.txt).

PaulWessel commented 2 years ago

We need to set up bots to check those things like we do for the scientific color maps.

maxrjones commented 2 years ago

It's set up for earth_relief and earth_synbath in https://github.com/GenericMappingTools/gmtserver-admin/blob/37aad71dbf559b6bb290ea79f3a68f7885ec7dd2/.github/workflows/dataset-check.yml. The other topex grids should be easy to set up, but I'm not sure what the query would look like for checking gebco, wdmam, etc.

Esteban82 commented 2 years ago

I will try to update the gravity grids first.

Is this check list ok?

        To-do list:
        - [ ] Check https://topex.ucsd.edu/pub/global_grav_1min/ for a new release
        - [ ] Update `recipes/earth_faa.recipe` with the new file name and version
        - [ ] Run `srv_downsampler_grid.sh earth_faa` from the `gmtserver-admin/scripts`  ??
        - [ ] Copy `staging/earth_faa_server.txt` to `information/`
        - [ ] Run `make server-info` from the `gmtserver-admin` top dir
        - [ ] Place the new `earth_faa` files on the GMT 'test' data server
        - [ ] Test the new files (e.g., https://github.com/GenericMappingTools/remote-datasets/blob/main/scripts/remote_map_check.sh)
        - [ ] Update `srtm_version` in `.github/workflows/srtm-check.yml`
        - [ ] Commit changes in a new branch and open a PR
        - [ ] Move files to GMT 'oceania' data server before merging PR

Esteban82 commented 2 years ago

Until now I've done the following steps:

    To-do list:
    - [ ] Check https://topex.ucsd.edu/pub/global_grav_1min/ for a new release
    - [ ] Update `recipes/earth_faa.recipe` with the new file name and version
    - [ ] Run `srv_downsampler_grid.sh earth_faa` from the `gmtserver-admin/scripts`
    - [ ] Run `srv_tiler.sh earth_faa` from the `gmtserver-admin/scripts` 
    - [ ] Copy `staging/earth_faa_server.txt` to `information/`
    - [ ] Run `make server-info` from the `gmtserver-admin` top dir

Esteban82 commented 2 years ago

The earth_faa.recipe file says:

# The range of -412.054656982 to +942.417297363 mGal means we may use offset of 300 and scale of 0.025

Is it possible to know how the offset and scale work?

And, more importantly, for v32 I got the values below. Should I modify those values (offset and scale) in the recipe?

/home/federico/.gmt/grav_32.1.nc: v_min: -388.633911133 v_max: 941.817321777 name: z

PaulWessel commented 2 years ago

Yes, that probably should be documented!

It comes from the fact we store the data as short integers (16 bits) which has a range of -32768 to +32767. So the scale and offset are selected so that rint [(data - offset) / scale] fits in that range. With your new values we find

gmt math -Q -388.633911133 300 SUB 0.025 DIV RINT =
-27545
gmt math -Q 941.817321777 300 SUB 0.025 DIV RINT =
25673

so those values still work. We want the scale to be a sensible number, not the smallest possible, so that 1/scale is an integer, e.g., 1/scale = 40. So those are the "rules". It means that 1 unit in the integer grid is 0.025 mGal, and that is the precision of our version of the data. Given that these grids do not change much as they improve, we probably never have to change those settings.
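
The same check can be sketched in Python instead of `gmt math`. This is only an illustration of the rule above: the function names `packed_range` and `fits_int16` are hypothetical and not part of GMT or the gmtserver-admin scripts, and Python's `round` stands in for `rint`.

```python
# Range of a signed 16-bit integer, as used for the packed grids.
INT16_MIN, INT16_MAX = -32768, 32767


def packed_range(v_min, v_max, offset, scale):
    """Integers that v_min and v_max map to under rint((data - offset) / scale)."""
    return round((v_min - offset) / scale), round((v_max - offset) / scale)


def fits_int16(v_min, v_max, offset, scale):
    """True if both packed extremes fit in a signed 16-bit integer."""
    lo, hi = packed_range(v_min, v_max, offset, scale)
    return INT16_MIN <= lo and hi <= INT16_MAX


# v32 free-air anomaly range reported in this thread, with the recipe's settings.
lo, hi = packed_range(-388.633911133, 941.817321777, offset=300, scale=0.025)
print(lo, hi)  # -27545 25673, matching the gmt math results above
print(fits_int16(-388.633911133, 941.817321777, offset=300, scale=0.025))  # True
```

With an offset of 0 instead of 300 the maximum would pack to about 37673, which is outside the 16-bit range, so the offset is what centres the data in the available integers.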

Esteban82 commented 2 years ago

I already have a UH account.

What should I do now? These? How exactly can I do those?

    - [ ] Place the new `earth_faa` files on the GMT 'test' data server
    - [ ] Test the new files (e.g., https://github.com/GenericMappingTools/remote-datasets/blob/main/scripts/remote_map_check.sh)

PaulWessel commented 2 years ago

I just went in and cleaned up the earth_faa folder so it is empty. You will need to scp your files to

/export/gmtserver/gmt/test/server/earth/earth_faa

Then you need to place the updated gmt_data_server.txt file in

/export/gmtserver/gmt/test

Finally, to actually access those files you need to do

gmt set GMT_DATA_SERVER test

and then try to make a plot using @earth_faa_*

Be careful so you do not accidentally do anything in /export/gmtserver/gmt/data

Esteban82 commented 2 years ago

I tried this command. After I typed my password, I got a Permission denied message. Do I need a permission? Or is there something wrong with my scp command?

scp -r /home/federico/Software/Github/gmtserver-admin/staging/earth/earth_faa esteban82@gmtserver.SOEST.Hawaii.edu:/export/gmtserver/gmt/test/server/earth/earth_faa
esteban82@gmtserver.soest.hawaii.edu's password: 
scp: /export/gmtserver/gmt/test/server/earth/earth_faa/earth_faa: Permission denied

PaulWessel commented 2 years ago

Not sure, I asked that you be put in the gmt group but perhaps they forgot. Anyway, if you are copying the directory like that, then there should not be a trailing earth_faa on the receiving side. Since I left the previous dir there, I just removed it. So you should try

scp -r /home/federico/Software/Github/gmtserver-admin/staging/earth/earth_faa esteban82@gmtserver.SOEST.Hawaii.edu:/export/gmtserver/gmt/test/server/earth

and if that fails then please ssh into the server as you and type

groups

if it does not return

gmt

among other groups then we email SOEST IT to fix that.

Esteban82 commented 2 years ago

I get only esteban82 when I type groups:

-bash-4.2$ groups
esteban82

PaulWessel commented 2 years ago

OK, that is the problem, I have emailed IT.

Esteban82 commented 2 years ago

Now I have permission. The FAA data is being updated in the test directory right now.

Esteban82 commented 2 years ago

> I just went in and cleaned up the earth_faa folder so it is empty. You will need to scp your files to
>
> /export/gmtserver/gmt/test/server/earth/earth_faa
>
> Then you need to place the updated gmt_data_server.txt file in
>
> /export/gmtserver/gmt/test
>
> Finally, to actually access those files you need to do
>
> gmt set GMT_DATA_SERVER test

I have done it. It seems OK. I got a map, and the terminal reports the new version.

Could you confirm it? What is the next step?

PaulWessel commented 2 years ago

Yes, looks like it:

grdimage [NOTICE]: IGPP Earth Free Air Gravity Anomalies v32 at 10x10 arc minutes reduced by Gaussian Cartesian filtering (18 km fullwidth) [Sandwell et al., 2019].

> Could you confirm it? What is the next step?

Is there a v32 VGG (curv) data set as well? Perhaps we should do both before releasing the update. I think that is it for Sandwell, no? I.e., there is not a new srtm topo grid yet, right?

Esteban82 commented 2 years ago

Ok, I will do the same with the VGG data set.

No, there isn't any newer version of the SRTM15 data set (the current one is 2.4, from March).

Should I also make the GEBCO for the release?

PaulWessel commented 2 years ago

If you have the time, yes, please do the GEBCO as well. Note that the recipe has that URL on the SOEST server because I needed to run tests many times, and it was not a good experience copying that 8 Gb file from the UK to Hawaii. I think if you copy and place that file in your staging directory, then that is where the download script will look first and use it.

Esteban82 commented 2 years ago

To empty the vgg folder, I should log in to the server, go to /export/gmtserver/gmt/test/server/earth/earth_vgg and do rm -r earth_vgg*, right?

Esteban82 commented 2 years ago

The vgg was uploaded and seems fine.

For the GEBCO grid I got a memory error. So I think I won't be able to do it.

Convert GEBCO_2022.nc to earth/earth_gebco/earth_gebco_15s_p.grd=ns+s1+o0
grdconvert (gmtapi_import_grid): Could not reallocate memory [13.91 Gb, 3732998416 items of 4 bytes]

PaulWessel commented 2 years ago

OK, I can do the GEBCO. It does read and write the entire thing, so that probably takes 16 Gb+ depending on other things on the system. Mine has 64 Gb :-)
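
The figures in the grdconvert message can be reproduced with a little arithmetic. This is a minimal sketch under stated assumptions: the 15 arc-second global grid is taken to have 86400 x 43200 nodes (pixel registration), GMT is assumed to hold the grid in memory as 4-byte floats with a pad of 2 extra rows and columns on each side, and `grid_memory` is a hypothetical helper name. The pad value is an assumption, although it reproduces the item count in the message exactly.

```python
def grid_memory(inc_arcsec=15, pad=2, bytes_per_node=4):
    """Return (number of nodes including padding, size in GiB) for a global grid."""
    n_cols = 360 * 3600 // inc_arcsec  # 86400 columns at 15 arc seconds
    n_rows = 180 * 3600 // inc_arcsec  # 43200 rows at 15 arc seconds
    items = (n_cols + 2 * pad) * (n_rows + 2 * pad)
    return items, items * bytes_per_node / 2**30


items, gib = grid_memory()
print(items)          # 3732998416, the item count in the error message
print(round(gib, 2))  # 13.91
```

That is the memory for a single in-memory copy of the grid, which is why a machine with 16 Gb or less can fail once the rest of the system is taken into account.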

Esteban82 commented 1 year ago

Well, I think that this can be closed.

I see that this issue has info on how to update the remote data server and other things. So I will add a label (how-to) so it can be easily found.