GenericMappingTools / gmtserver-admin

Cache data and script for managing the GMT data server
GNU Lesser General Public License v3.0

Update wdmam (2.1) #204

Closed Esteban82 closed 1 year ago

Esteban82 commented 1 year ago

I added the command for earth_wdmam to the makefile. I also updated the recipe and the server information.

The recipe works but I have some doubts:

Esteban82 commented 1 year ago

I updated the files on the test server. Only gmt_data_server.txt still has to be updated.

PaulWessel commented 1 year ago

If you manually go to a web page then the critical commands in the downloader are not executed:

# 5.1 Determine if this source is an URL and if we need to download it first
is_url=$(echo "${SRC_FILE}" | grep -c :)    # 1 if the name contains a colon, else 0
if [ "${is_url}" -gt 0 ]; then    # Data source is an URL (a bare [ $is_url ] is also true for 0)
    if [ ! -f "${SRC_BASENAME}" ]; then # Must download first
        echo "srv_downsampler_grid.sh: Must download original source ${SRC_FILE}"
        curl -k "${SRC_FILE}" --output "${SRC_BASENAME}"
    fi
    SRC_ORIG=${SRC_FILE}
    SRC_FILE=${SRC_BASENAME}
fi

You should hide away your grid.txt file and run the script and let it get the file via curl and set the right names etc. Also Choi et al, what year?
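
The `grep -c :` test treats any name containing a colon as a URL. A slightly stricter check could match a scheme prefix with a `case` pattern, as in the minimal sketch below. The helpers `is_url` and `local_name` are hypothetical and not part of srv_downsampler_grid.sh, and the URL is a placeholder.

```shell
# Hypothetical helper: succeed only if the argument looks like scheme://something.
is_url() {
    case "$1" in
        *://*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper: print the local file name a source would be stored under.
local_name() {
    if is_url "$1"; then
        basename "$1"          # remote source: keep only the last path component
    else
        printf '%s\n' "$1"     # local source: use the name as given
    fi
}

local_name "https://example.org/file/grid.txt"   # placeholder URL
local_name "grid.txt"
```

Both calls print `grid.txt`, so the rest of a script could refer to the same local name whether or not a download was needed.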

Esteban82 commented 1 year ago

Also Choi et al, what year?

As indicated in #203, the webpage shows no year.

Please cite version 2.1 as: Choi, Y., Dyment, J., Lesur, V., Garcia Reyes, Catalan, M., Ishihara, T., Litvinova, T., Hamoudi, M., the WDMAM Task Force*, and the WDMAM Data Providers**, World Digital Magnetic Anomaly Map version 2.1, map available at http://www.wdmam.org/.

Esteban82 commented 1 year ago

The original grid has values ranging from -3323.70385742 to 8915.15527344.

So I changed the precision to 0.2 nT (instead of 0.3). I had to change the offset to 3000.

I still have to update the files on the server.
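
Whether a scale/offset pair covers the grid can be checked with a little arithmetic, assuming the usual packing convention value = packed * scale + offset with signed 16-bit integers (the convention behind netCDF `scale_factor`/`add_offset`; that the server recipe applies it in exactly this way is an assumption). The helper name `packed_range` is hypothetical.

```shell
# Print the lowest and highest values representable as int16 for a given scale and offset.
packed_range() {
    awk -v s="$1" -v o="$2" 'BEGIN { printf "%.1f %.1f\n", o - s * 32768, o + s * 32767 }'
}

packed_range 0.2 0      # -6553.6 6553.4: too narrow for a maximum of 8915.16
packed_range 0.2 3000   # -3553.6 9553.4: covers -3323.70 to 8915.16
```

Under these assumptions, a 0.2 nT step only spans the data range once the offset shifts the window toward the positive side.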

Esteban82 commented 1 year ago

If you manually go to a web page then the critical commands in the downloader are not executed:

# 5.1 Determine if this source is an URL and if we need to download it first
is_url=$(echo "${SRC_FILE}" | grep -c :)  # 1 if the name contains a colon, else 0
if [ "${is_url}" -gt 0 ]; then  # Data source is an URL (a bare [ $is_url ] is also true for 0)
  if [ ! -f "${SRC_BASENAME}" ]; then # Must download first
      echo "srv_downsampler_grid.sh: Must download original source ${SRC_FILE}"
      curl -k "${SRC_FILE}" --output "${SRC_BASENAME}"
  fi
  SRC_ORIG=${SRC_FILE}
  SRC_FILE=${SRC_BASENAME}
fi

You should hide away your grid.txt file and run the script and let it get the file via curl and set the right names etc.

I think that, in order to do this, I need the real URL of the source file (grid.txt), but I can't find it on the webpage. There I have to press a button to get the file. I tried to inspect the webpage but I am not very good at this.


PaulWessel commented 1 year ago

Try the old URL but with grid.txt.

It always amazes me when data providers think a web button is a good idea…

Esteban82 commented 1 year ago

Try the old URL but with grid.txt.

Yes, that is how I changed SRC_FILE (https://wdmam.org/file/grid.txt), but it does not work.

Esteban82 commented 1 year ago

So I changed the precision to 0.2 nT (instead of 0.3). I had to change the offset to 3000.

I think I did this wrong. I will check.

joa-quim commented 1 year ago

Don’t know why you are doing it but mind you that those grids have a precision of only a couple nTs.

PaulWessel commented 1 year ago

OK, probably not worth doing. But you should kick those frogs who think registering for doing a download is a cool thing. Perhaps we should add a similar system for GMT, but only for them?

Esteban82 commented 1 year ago

Don’t know why you are doing it but mind you that those grids have a precision of only a couple nTs.

Yes, I know that it is irrelevant. I just wanted to learn a little about scale, offset and 16-bit integers.

With a 16-bit integer I have 65,536 values, so the precision is OK.
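
The same reasoning gives a lower bound on the step size: the data range divided by the number of integer steps available. This is a minimal sketch, assuming all 65,536 codes are usable (in practice one may be reserved for missing data); `min_scale` is a hypothetical helper.

```shell
# Smallest scale (in data units) that still spans [lo, hi] with 16-bit integers.
min_scale() {
    awk -v lo="$1" -v hi="$2" 'BEGIN { printf "%.3f\n", (hi - lo) / 65535 }'
}

min_scale -3323.70385742 8915.15527344   # 0.187
```

A step of 0.2 nT sits just above this bound, which is consistent with the choice above, provided the offset centres the window on the data.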

Esteban82 commented 1 year ago

I changed the script to use the SRC_FILE indicated by Paul in #203.

I also had to change the script to make the grid.

# SRC_CUSTOM="gmt xyz2grd grid.asc.txt -i0,1,2 -Rd -I3m -fg -Ggrid.asc.nc"

Esteban82 commented 1 year ago

I think that this PR can be approved.

PaulWessel commented 1 year ago

OK, doing so without actually testing!

Esteban82 commented 1 year ago

Closes #203.

seisman commented 1 year ago

@Esteban82 I'd like to know if the wdmam data on the data server have been updated.

In PyGMT, we suddenly have a new failing test.

gmt grdcut @earth_wdmam_03m_g -R10/13/-60/-58 -Gout.nc
gmt grdinfo out.nc -C

The above commands give:

out.nc  10  13  -60 -58 1640.19995117   2486.39990234   0.05    0.05    61  41  0   1

As you can see, the minimum value is 1640.19995117 in this region, but our PyGMT test shows that the value was -639.7001.
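
A regression check of this kind can be scripted by reading the z-range from the `grdinfo -C` record, where fields 6 and 7 hold the minimum and maximum. The sketch below works on the text record only, so it runs without GMT; `check_zmin` and the default tolerance of 0.001 are hypothetical choices.

```shell
# Succeed if field 6 (z minimum) of a `gmt grdinfo -C` record is within tol of the expected value.
check_zmin() {
    printf '%s\n' "$1" | awk -v want="$2" -v tol="${3:-0.001}" '
        { d = $6 - want; if (d < 0) d = -d; if (d <= tol) exit 0; exit 1 }'
}

record="out.nc 10 13 -60 -58 1640.19995117 2486.39990234 0.05 0.05 61 41 0 1"
if check_zmin "$record" -639.7001; then
    echo "zmin matches"
else
    echo "zmin differs from the expected -639.7001"
fi
```

With GMT installed, the record could be taken directly from the command, for example `check_zmin "$(gmt grdinfo out.nc -C)" -639.7001`.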

Esteban82 commented 1 year ago

I have only updated the data on the test server (unless I did it by mistake).

seisman commented 1 year ago

-bash-4.2$ md5sum data/server/earth/earth_wdmam/earth_wdmam_03m_g/*
898fba6ccbbbd0b2150369d28139f48a  data/server/earth/earth_wdmam/earth_wdmam_03m_g/N00E000.earth_wdmam_03m_g.jp2
43c5b171bd2869c65c6d874f7bd58ce6  data/server/earth/earth_wdmam/earth_wdmam_03m_g/N00E090.earth_wdmam_03m_g.jp2
ac48d6a3d5c490b89a80bb88030be0ef  data/server/earth/earth_wdmam/earth_wdmam_03m_g/N00W090.earth_wdmam_03m_g.jp2
ac9dc74da13b0cf62121674e2adaf9fe  data/server/earth/earth_wdmam/earth_wdmam_03m_g/N00W180.earth_wdmam_03m_g.jp2
878164ccaacbaf97bea93421cf118e6e  data/server/earth/earth_wdmam/earth_wdmam_03m_g/S90E000.earth_wdmam_03m_g.jp2
e26b7dbc548b887730d2669a54296b23  data/server/earth/earth_wdmam/earth_wdmam_03m_g/S90E090.earth_wdmam_03m_g.jp2
d797c642f0dd62959bbcbe5ece7f863a  data/server/earth/earth_wdmam/earth_wdmam_03m_g/S90W090.earth_wdmam_03m_g.jp2
c6985578454eee8c2624de639ddde850  data/server/earth/earth_wdmam/earth_wdmam_03m_g/S90W180.earth_wdmam_03m_g.jp2
-bash-4.2$ md5sum test/server/earth/earth_wdmam/earth_wdmam_03m_g/*
45af9c97590c87c09edc269896d0c034  test/server/earth/earth_wdmam/earth_wdmam_03m_g/N00E000.earth_wdmam_03m_g.jp2
c979d8675f9676d8ee60a96e03e6bd06  test/server/earth/earth_wdmam/earth_wdmam_03m_g/N00E090.earth_wdmam_03m_g.jp2
066b3d49b96cd2a4f88ef84757649d46  test/server/earth/earth_wdmam/earth_wdmam_03m_g/N00W090.earth_wdmam_03m_g.jp2
47c46505ceeb2d1a98593bc717abcd2c  test/server/earth/earth_wdmam/earth_wdmam_03m_g/N00W180.earth_wdmam_03m_g.jp2
a28cee6b9e7e9c24c0cb8e150fba5059  test/server/earth/earth_wdmam/earth_wdmam_03m_g/S90E000.earth_wdmam_03m_g.jp2
84aa68f556cca31186b9df3e810152e3  test/server/earth/earth_wdmam/earth_wdmam_03m_g/S90E090.earth_wdmam_03m_g.jp2
981066c20436cbbb3f3186d40b8b19d2  test/server/earth/earth_wdmam/earth_wdmam_03m_g/S90W090.earth_wdmam_03m_g.jp2
2ab1a17da0ad62c95dce85364375f92c  test/server/earth/earth_wdmam/earth_wdmam_03m_g/S90W180.earth_wdmam_03m_g.jp2

The data files in the data and test directories have different MD5 checksums, so I think what you have done is correct. That makes it even stranger that we get a different value in PyGMT.
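
The comparison can also be scripted per tile. The sketch below uses `cmp -s` rather than `md5sum` (a byte-for-byte comparison answers the same question) and lists the files whose contents differ between two directories. `differing_files` is a hypothetical helper, and the two directory arguments stand in for the data and test trees.

```shell
# Print the name of every file in directory $1 that is missing from, or differs in, directory $2.
differing_files() {
    for f in "$1"/*; do
        name=$(basename "$f")
        # cmp -s is silent and returns non-zero when the files differ or one is missing.
        if ! cmp -s "$f" "$2/$name"; then
            printf '%s\n' "$name"
        fi
    done
}

# Example call with the paths from the listing above:
# differing_files data/server/earth/earth_wdmam/earth_wdmam_03m_g \
#                 test/server/earth/earth_wdmam/earth_wdmam_03m_g
```

An empty result would mean the two trees hold identical tiles. Here every tile would be listed, since all eight checksums differ.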

seisman commented 1 year ago

Even stranger, I get a totally different output using the test server:

$ gmt grdcut @earth_wdmam_03m_g -R10/13/-60/-58 -Gout.nc
grdblend [NOTICE]: Remote data courtesy of GMT data server test [http://test.generic-mapping-tools.org]

grdblend [NOTICE]: WDMAM Earth Magnetic Anomalies v2.1 original at 03x03 arc minutes [Choi et al. 2023].
grdblend [NOTICE]:   -> Download 90x90 degree grid tile (earth_wdmam_03m_g): S90E000
$ gmt grdinfo out.nc -C                                 
out.nc  10  13  -60 -58 -790.199951172  528 0.05    0.05    61  41  0   1

Esteban82 commented 1 year ago

Yes, I checked on the server. The data are from January 2022.

-bash-4.2$ ls -l
total 25884
-rw-rw-r-- 1 pwessel gmt  101608 Jan 26  2022 earth_wdmam_01d_g.grd
-rw-rw-r-- 1 pwessel gmt  101439 Jan 26  2022 earth_wdmam_01d_p.grd
drwxrwxr-x 2 pwessel gmt     302 Jan 26  2022 earth_wdmam_03m_g
drwxrwxr-x 2 pwessel gmt      80 Jan 26  2022 earth_wdmam_04m_g
drwxrwxr-x 2 pwessel gmt      80 Jan 26  2022 earth_wdmam_04m_p
drwxrwxr-x 2 pwessel gmt      80 Jan 26  2022 earth_wdmam_05m_g
drwxrwxr-x 2 pwessel gmt      80 Jan 26  2022 earth_wdmam_05m_p
-rw-rw-r-- 1 pwessel gmt 7735608 Jan 26  2022 earth_wdmam_06m_g.grd
-rw-rw-r-- 1 pwessel gmt 7659355 Jan 26  2022 earth_wdmam_06m_p.grd
-rw-rw-r-- 1 pwessel gmt 2949996 Jan 26  2022 earth_wdmam_10m_g.grd
-rw-rw-r-- 1 pwessel gmt 2943161 Jan 26  2022 earth_wdmam_10m_p.grd
-rw-rw-r-- 1 pwessel gmt 1352963 Jan 26  2022 earth_wdmam_15m_g.grd
-rw-rw-r-- 1 pwessel gmt 1354433 Jan 26  2022 earth_wdmam_15m_p.grd
-rw-rw-r-- 1 pwessel gmt  778865 Jan 26  2022 earth_wdmam_20m_g.grd
-rw-rw-r-- 1 pwessel gmt  776768 Jan 26  2022 earth_wdmam_20m_p.grd
-rw-rw-r-- 1 pwessel gmt  365660 Jan 26  2022 earth_wdmam_30m_g.grd
-rw-rw-r-- 1 pwessel gmt  364175 Jan 26  2022 earth_wdmam_30m_p.grd
-bash-4.2$ pwd
/export/gmtserver/gmt/data/server/earth/earth_wdmam

seisman commented 1 year ago

gmt grdcut @earth_wdmam_03m_g -R10/13/-60/-58 -Gout.nc
gmt grdinfo out.nc -C

Using the oceania server, the above commands produce:

out.nc  10  13  -60 -58 1640.19995117   2486.39990234   0.05    0.05    61  41  0   1

The output doesn't make sense because the minimum and maximum values of the WDMAM dataset should be about -500 nT and 500 nT respectively (see the colorbar at https://wdmam.org/).

Could you please verify it?

seisman commented 1 year ago

I believe I understand why the dataset changed suddenly.

This PR changed the scale and offset properties of the dataset and updated the information/earth_wdmam_server.txt file, which was used to build the information/gmt_data_server.txt file.

Thus, now the gmt_data_server.txt file on the data server has the scale/offset for the new dataset while the actual data files are still old. So we need to update the WDMAM data files to fix the issue (that's also why the commands work well with the test data server https://github.com/GenericMappingTools/gmtserver-admin/pull/204#issuecomment-1667079041).
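
The mismatch can be illustrated with a single packed integer, assuming values are unpacked as packed * scale + offset. The numbers are illustrative only: the packed value -2133 is made up, the old offset of 0 is an assumption, and 0.3, 0.2 and 3000 are the figures mentioned earlier in this thread, which may not be the ones finally deployed. `unpack` is a hypothetical helper.

```shell
# Convert a packed integer to a data value with the given scale and offset.
unpack() {
    awk -v p="$1" -v s="$2" -v o="$3" 'BEGIN { printf "%.1f\n", p * s + o }'
}

packed=-2133                 # hypothetical integer stored in an old tile
unpack "$packed" 0.3 0       # decoded with the old metadata: -639.9
unpack "$packed" 0.2 3000    # decoded with the new metadata: 2573.4
```

The stored integers are unchanged, but reading them with metadata written for a different packing shifts and rescales every value, which would explain positive values in the thousands where negative ones were expected.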

seisman commented 1 year ago

In my opinion, we should update or add only one dataset at a time and follow these steps:

  1. Open an issue report
  2. Open a PR, update the dataset information, rebuild the dataset and manually copy the dataset and the new gmt_data_server.txt file to the test server
  3. Ask people to use the test server and test the updated dataset to make sure that everything works well
  4. Merge the PR and copy the data files from the test server to the data server
  5. Open a separate PR in https://github.com/GenericMappingTools/remote-datasets and update the dataset page and the changelog page
  6. Close the original issue report

Edit: I think we should have an issue template for adding a new dataset or updating an existing dataset, although the detailed steps need more discussion.

PaulWessel commented 1 year ago

Agree with the checklist. Also, we need better documentation/howto for both gmtserver-admin and remote data.

Esteban82 commented 1 year ago

Could you please verify it?

Yes, I got the same result.

Esteban82 commented 1 year ago

I believe I understand why the dataset changed suddenly.

This PR changed the scale and offset properties of the dataset and updated the information/earth_wdmam_server.txt file, which was used to build the information/gmt_data_server.txt file.

Thus, now the gmt_data_server.txt file on the data server has the scale/offset for the new dataset while the actual data files are still old. So we need to update the WDMAM data files to fix the issue (that's also why the commands work well with the test data server #204 (comment)).

Ok, I see that the issue is that the scale and offset are included in gmt_data_server.txt. I will make a PR to fix this.