GenericMappingTools / gmtserver-admin

Cache data and script for managing the GMT data server
GNU Lesser General Public License v3.0

rsync the server data #23

Closed PaulWessel closed 4 years ago

PaulWessel commented 5 years ago

Per SOEST IT staff, this is now configured and users need to run

rsync -rP '*' <destination_directory>

where <destination_directory> is the full path to where they want to mirror these files on their local computer. The quotes are needed due to the * wildcard. I just tested this on my Mac and it ran fine. We have replaced the symlinks with actual files and directories. Let me know how this is working for @joa-quim and @seisman now. I notice the files are created with rw for owner only, but that is probably a umask setting for me rather than in general. Thus, you may need to do a

chmod -R og+r <destination_directory>

to make sure files are readable.
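A minimal sketch of the permission fix, assuming a umask of 077 left the mirror readable by the owner only. The capital X is a swapped-in variation on the og+r above: it also adds search permission on directories, which other users need in order to descend into the tree. The function name and the path in the usage line are hypothetical.

```shell
# Make a mirrored tree readable by group and others.
# r: read permission on everything.
# X: execute/search permission on directories (and on files that are
#    already executable by someone), so directories stay traversable.
make_readable() {
    chmod -R og+rX "$1"
}
```

Usage would look like `make_readable /path/to/mirror`.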

joa-quim commented 4 years ago

No, I'm not. The double colon requires port 873, which is closed for me, whilst a single colon uses port 22, to which I have access. The problem is that it asks for the password, so I can't put it in a cron job. sshpass should let me give the password on the command line, but it fails too

Permission denied, please try again.
Permission denied, please try again.
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(644) [Receiver=3.1.3]
PaulWessel commented 4 years ago

The full path locally on gmtserver is /export/gmtserver/gmt/data Sorry zoom in 2 minutes.

seisman commented 4 years ago

It's a bad idea to use port 22. If you're the only one who can access the machine, you can give your password following the instruction:

       Some  modules  on  the remote daemon may require authentication. If so,
       you will receive a password prompt when you connect. You can avoid  the
       password  prompt  by setting the environment variable RSYNC_PASSWORD to
       the password you want to use or using the --password-file option.  This
       may be useful when scripting rsync.

       WARNING:  On  some  systems  environment  variables  are visible to all
       users. On those systems using --password-file is recommended.
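A sketch of the --password-file route for an unattended job. It applies only to rsync daemon connections (double colon or rsync://, port 873). The host, module name, and file paths here are placeholders, since the real ones are not given in this thread.

```shell
# Placeholder daemon URL (an assumption, not the real server address).
REMOTE="rsync://mirror.example.org/gmtdata/"

# Write the daemon password to a file that only the owner can read.
# rsync rejects a password file that is accessible by other users.
make_password_file() {
    ( umask 077 && printf '%s\n' "$2" > "$1" )
    chmod 600 "$1"
}

# Print the command a cron job would run; printing keeps this sketch
# free of network side effects.
mirror_command() {
    printf 'rsync -a --delete --password-file=%s %s %s\n' "$1" "$REMOTE" "$2"
}
```

For example, `make_password_file ~/.gmt_rsync_pw 'secret'` followed by `mirror_command ~/.gmt_rsync_pw /your/server/gmt/data` prints the line to place in the crontab.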
seisman commented 4 years ago

@joa-quim This command should work for you using port 22:

rsync -a --delete'/export/gmtserver/gmt/data/*' /your/server/gmt/data
joa-quim commented 4 years ago

I don't have space for both data_6.0 and data_6.1. And it's still asking for a passwd.

seisman commented 4 years ago

/export/gmtserver/gmt/data is a symlink to /export/gmtserver/gmt/data_6.1. Running the command above won't download the data_6.0.

And it's still asking for a passwd.

You're right. I believe the two methods above only work for the rsync protocol (port 873).

PaulWessel commented 4 years ago

Even if you get the files over by hook or by crook, will port 22 be enough for users to get data via libcurl calls? I confess I don't know what port that is using. There must be a way to shame your IT people into opening up 873?
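For reference, a small sketch mapping each transfer method in this thread to its default port. libcurl fetches over HTTP or HTTPS, so it uses 80 or 443 unless the URL names another port. The function name is made up for illustration.

```shell
# Print the default port for a transfer method.
default_port() {
    case "$1" in
        rsync) echo 873 ;;  # rsync daemon (double colon or rsync://)
        ssh)   echo 22 ;;   # rsync over a remote shell (single colon)
        http)  echo 80 ;;
        https) echo 443 ;;
        *)     echo "unknown scheme: $1" >&2; return 1 ;;
    esac
}
```

So a mirror that is reachable on port 22 for rsync still needs 80 or 443 open before GMT users can download from it.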

joa-quim commented 4 years ago

Found a solution. This works. No more passwd request.

joa-quim commented 4 years ago

So I can rsync with

rsync -a --delete --progress .

but had to stop it because it's copying both

41G    ./server/earth/earth_relief/earth_relief_01s_g



which is the same data. I don't have space for this duplication, and even if I had, it's silly.

PaulWessel commented 4 years ago

I don't understand, the first path is not under data_6.1?

joa-quim commented 4 years ago

No, I got this


PaulWessel commented 4 years ago

Here is what is actually on the server:

-bash-4.2$ pwd
-bash-4.2$ ls
cache                   earth_relief_03m.grd    earth_relief_06m.grd    earth_relief_30m.grd    gmt_hash_server_previous.txt
earth_relief_01d.grd    earth_relief_03m_g.grd  earth_relief_10m.grd    earth_relief_30s.grd    gmt_md5_server.txt
earth_relief_01m.grd    earth_relief_04m.grd    earth_relief_15m.grd    earth_relief_30s_g.grd  gmt_md5_server.txt.orig
earth_relief_01m_g.grd  earth_relief_04m_g.grd  earth_relief_15s.grd    earth_relief_60m.grd    server
earth_relief_02m.grd    earth_relief_05m.grd    earth_relief_15s_p.grd  gmt_data_server.txt     srtm1
earth_relief_02m_g.grd  earth_relief_05m_g.grd  earth_relief_20m.grd    gmt_hash_server.txt     srtm3

That you somehow get data_6.1 inside data_6.1 is a mystery. Perhaps the symbolic srtm? links, which point to the earth_relief_0?s_g folders, get duplicated?

seisman commented 4 years ago

@joa-quim These two commands (note the trailing slash in command 2) do different things:

rsync -a --delete --progress .
rsync -a --delete --progress .

I believe you're mixing these two.
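A local sketch of the trailing-slash rule, assuming rsync is installed. Without the slash, rsync copies the source directory itself into the destination; with it, only the contents are copied. That difference may be how a nested data_6.1/data_6.1 arises. The temporary directory names are illustrative.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/data_6.1" "$work/a/data_6.1" "$work/b/data_6.1"
touch "$work/src/data_6.1/gmt_data_server.txt"

if command -v rsync >/dev/null 2>&1; then
    # No trailing slash on the source: creates a/data_6.1/data_6.1/
    rsync -a "$work/src/data_6.1" "$work/a/data_6.1"
    # Trailing slash on the source: contents land directly in b/data_6.1/
    rsync -a "$work/src/data_6.1/" "$work/b/data_6.1"
    # List what ended up where
    find "$work/a" "$work/b" -type f
fi
```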

joa-quim commented 4 years ago

Ok, I removed data_6.1/data_6.1 and if I do

rsync -a --delete --progress .

it doesn't download anything else.

seisman commented 4 years ago

Ok, I removed data_6.1/data_6.1 and if I do

rsync -a --delete --progress .

it doesn't download anything else.

It means you already have all the files locally, right?

PaulWessel commented 4 years ago

Is that good? I mean do you have everything so this is a success, or do you mean now it does not download any of the files still missing?

joa-quim commented 4 years ago

I guess I have everything. Now I need to put those within reach of the HTTP server.

[jluis@fct-gmt data_6.1]$ du -h
108M    ./cache
28M     ./server/earth/earth_age/earth_age_01m_g
9.5M    ./server/earth/earth_age/earth_age_02m_g
9.2M    ./server/earth/earth_age/earth_age_02m_p
5.1M    ./server/earth/earth_age/earth_age_03m_g
5.6M    ./server/earth/earth_age/earth_age_03m_p
3.3M    ./server/earth/earth_age/earth_age_04m_g
3.3M    ./server/earth/earth_age/earth_age_04m_p
2.4M    ./server/earth/earth_age/earth_age_05m_g
2.5M    ./server/earth/earth_age/earth_age_05m_p
500M    ./server/earth/earth_age
319M    ./server/earth/earth_day
25M     ./server/earth/earth_mask
548M    ./server/earth/earth_night
161M    ./server/earth/earth_relief/earth_relief_01m_g
162M    ./server/earth/earth_relief/earth_relief_01m_p
41G     ./server/earth/earth_relief/earth_relief_01s_g
50M     ./server/earth/earth_relief/earth_relief_02m_g
50M     ./server/earth/earth_relief/earth_relief_02m_p
25M     ./server/earth/earth_relief/earth_relief_03m_g
25M     ./server/earth/earth_relief/earth_relief_03m_p
6.8G    ./server/earth/earth_relief/earth_relief_03s_g
15M     ./server/earth/earth_relief/earth_relief_04m_g
15M     ./server/earth/earth_relief/earth_relief_04m_p
9.5M    ./server/earth/earth_relief/earth_relief_05m_g
9.5M    ./server/earth/earth_relief/earth_relief_05m_p
1.5G    ./server/earth/earth_relief/earth_relief_15s_p
493M    ./server/earth/earth_relief/earth_relief_30s_g
521M    ./server/earth/earth_relief/earth_relief_30s_p
51G     ./server/earth/earth_relief
52G     ./server/earth
52G     ./server
57G     .
PaulWessel commented 4 years ago

Good, yes, 57 GB is it. I told UNAVCO yesterday to start their rsync, so we will see.
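Since the full mirror is about 57 GB, a pre-flight check on free space can save an aborted transfer. A rough sketch using POSIX df output; the function name and the default of 57 are assumptions for illustration.

```shell
# Succeed if the filesystem holding directory $1 has at least $2 GiB free
# (default 57, the total reported by du above).
has_space() {
    need_kb=$(( ${2:-57} * 1024 * 1024 ))
    # df -P prints one line per filesystem; column 4 is available 1K blocks.
    free_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -ge "$need_kb" ]
}
```

A cron wrapper could then run `has_space /home/gmtdata || exit 1` before calling rsync.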

PaulWessel commented 4 years ago

How about the links we set up in the top data directory for GMT 6.0 and the larger *.grd files that were not tiled until recently? E.g.

earth_relief_20m.grd -> server/earth/earth_relief/earth_relief_20m_g.grd

joa-quim commented 4 years ago


lrwxrwxrwx. 1 jluis jluis         48 Jun 26 18:47 earth_relief_20m.grd -> server/earth/earth_relief/earth_relief_20m_g.grd
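One way to confirm that such compatibility links survived the transfer is to list top-level symlinks whose targets are missing. A sketch, assuming it is run against the mirrored data directory; the function name is hypothetical. (rsync -a includes -l, so links are copied as links.)

```shell
# Print symlinks directly under directory $1 that do not resolve
# to an existing file or directory.
dangling_links() {
    find "$1" -maxdepth 1 -type l ! -exec test -e {} \; -print
}
```

An empty result from `dangling_links /home/gmtdata/data_6.1` would mean every top-level link points at something that exists.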
joa-quim commented 4 years ago


GMT_DATA_SERVER                =

so I must make a symlink from /home/gmtdata/data_6.1 to a dir in, but what is the name of that dir?

PaulWessel commented 4 years ago

I am afraid I do not know exactly how the SOEST IT people link our gmtserver directory to the main SOEST website URL. All I know is that our forward to SOEST is like this: -->

So if we do -->

then presumably you need a symlink from the top-level of your web server called gmt/data that points to your data_6.1 directory.
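A sketch of that symlink, assuming the web server's document root is passed in by the caller. The paths in the usage line are the ones reported later in this thread and may differ on other systems; the function name is hypothetical.

```shell
# Point <web_root>/gmt/data at the mirrored data directory.
publish_data() {
    data_dir=$1
    web_root=$2
    mkdir -p "$web_root/gmt"
    # -n replaces an existing link instead of descending into its target
    ln -sfn "$data_dir" "$web_root/gmt/data"
}
```

Usage would look like `publish_data /home/gmtdata/data_6.1 /var/www/html`.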

joa-quim commented 4 years ago

Done, if you try to access you'll get a 403 forbidden but the data is there

seisman commented 4 years ago

Here are some instructions to set up a CTAN mirror. They could be useful for people who want to set up a mirror but know little about web servers.

seisman commented 4 years ago
gmt grdinfo @earth_relief_15m -Vd

It doesn't work for me.

gmt [DEBUG]: Download remote file for the first time
gmt [INFORMATION]: Downloading file ...
gmt [INFORMATION]: Unable to download file
gmt [INFORMATION]: Libcurl Error: HTTP response code said error
gmt [INFORMATION]: Failed to get remote file
gmt [INFORMATION]: Unable to obtain remote information file gmt_data_server.txt
gmt [DEBUG]: Download remote file for the first time
gmt [INFORMATION]: Downloading file ...
gmt [INFORMATION]: Unable to download file
gmt [INFORMATION]: Libcurl Error: HTTP response code said error
gmt [INFORMATION]: Failed to get remote file
gmt [INFORMATION]: Unable to obtain remote hash table gmt_hash_server.txt
gmt [DEBUG]: Revised options: @earth_relief_15m -Vd
grdinfo [DEBUG]: Get remote file and write to /Users/seisman/.gmt/cache/earth_relief_15m
grdinfo [DEBUG]: Download to /Users/seisman/.gmt/cache/earth_relief_15m
PaulWessel commented 4 years ago

@joa-quim do you want to be europe.* or do you think your bandwidth might be too limiting to serve the whole continent?

joa-quim commented 4 years ago

Well, let's try. So far it's the only Europe. I can ask to monitor the traffic and see if it's not too much. The speed should be ok.

PaulWessel commented 4 years ago

OK, it is set up now:


Should it be https?

joa-quim commented 4 years ago

@seisman it's only. It worked for me

Get remote file and write to c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd
seisman commented 4 years ago

Still failing:

gmt [DEBUG]: Download remote file for the first time
gmt [INFORMATION]: Downloading file ...
gmt [INFORMATION]: Unable to download file
gmt [INFORMATION]: Libcurl Error: HTTP response code said error
gmt [INFORMATION]: Failed to get remote file
gmt [INFORMATION]: Unable to obtain remote information file gmt_data_server.txt
gmt [DEBUG]: Download remote file for the first time
gmt [INFORMATION]: Downloading file ...
gmt [INFORMATION]: Unable to download file
gmt [INFORMATION]: Libcurl Error: HTTP response code said error
gmt [INFORMATION]: Failed to get remote file
gmt [INFORMATION]: Unable to obtain remote hash table gmt_hash_server.txt
gmt [DEBUG]: Revised options: @earth_relief_15m -Vd
grdinfo [DEBUG]: Get remote file and write to /Users/seisman/.gmt/cache/earth_relief_15m
grdinfo [DEBUG]: Download to /Users/seisman/.gmt/cache/earth_relief_15m
grdinfo [ERROR]: Remote download is currently deactivated
grdinfo [ERROR]: Unable to obtain remote file @earth_relief_15m
grdinfo [ERROR]: Must specify one or more input files
joa-quim commented 4 years ago

https works too

grdinfo [DEBUG]: Download to c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd
seisman commented 4 years ago

https also doesn't work for me:

gmt [DEBUG]: Download remote file for the first time
gmt [INFORMATION]: Downloading file ...
gmt [INFORMATION]: Unable to download file
gmt [INFORMATION]: Libcurl Error: Couldn't connect to server
gmt [INFORMATION]: Failed to get remote file
gmt [INFORMATION]: Unable to obtain remote information file gmt_data_server.txt
gmt [DEBUG]: Download remote file for the first time
gmt [INFORMATION]: Downloading file ...
gmt [INFORMATION]: Unable to download file
gmt [INFORMATION]: Libcurl Error: Couldn't connect to server
gmt [INFORMATION]: Failed to get remote file
gmt [INFORMATION]: Unable to obtain remote hash table gmt_hash_server.txt
gmt [DEBUG]: Revised options: @earth_relief_15m -Vd
grdinfo [DEBUG]: Get remote file and write to /Users/seisman/.gmt/cache/earth_relief_15m
grdinfo [DEBUG]: Download to /Users/seisman/.gmt/cache/earth_relief_15m
grdinfo [ERROR]: Remote download is currently deactivated
grdinfo [ERROR]: Unable to obtain remote file @earth_relief_15m
grdinfo [ERROR]: Must specify one or more input files
joa-quim commented 4 years ago


Can you see (strange, I think, does not work)

seisman commented 4 years ago

I can see http, but not https.

PaulWessel commented 4 years ago

Might as well test with a direct curl, since GMT is not going to find 15m if the gmt_data_server.txt file is not downloaded first.

This downloads a copy of a file for me via curl:

curl -k -O

But for europe no:

curl -k -O
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (7) Failed to connect to port 443: Connection refused

seisman commented 4 years ago

curl -k -O works. curl -k -O fails.

PaulWessel commented 4 years ago

Doesn't for me. It quietly returns a file that says redirect or not found. Did you look at the file?

PaulWessel commented 4 years ago

Adding -L I get a file that says

/gmt/gmt_data_server.txt was not found on this server

so no /data ?

seisman commented 4 years ago

Doesn't for me. It quietly returns a file that says redirect or not found. Did you look at the file?

Redirection, too.

joa-quim commented 4 years ago

The curl -k -O seems to work, but the content is

<title>404 Not Found</title>
<h1>Not Found</h1>
<p>The requested URL /gmt_data_server.txt was not found on this server.</p>
joa-quim commented 4 years ago

but curl -k -O says

<title>403 Forbidden</title>
<p>You don't have permission to access /gmt/data/gmt_data_server.txt
on this server.<br />

Have to go to dinner again.
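Because curl -O saves whatever body the server returns, an error page can pass for a successful download. A sketch that reports the HTTP status instead, using curl's -w '%{http_code}' write-out. The function names are illustrative and the URL in the usage line is a placeholder.

```shell
# Print the final HTTP status for a URL after following redirects
# (000 if no HTTP response was received at all).
http_status() {
    curl -k -s -L -o /dev/null -w '%{http_code}' "$1"
}

# Translate the statuses seen in this thread into a hint.
explain_status() {
    case "$1" in
        200) echo "ok" ;;
        403) echo "forbidden: check file permissions and server access rules" ;;
        404) echo "not found: check the path under the web root" ;;
        000) echo "no response: connection refused or port closed" ;;
        *)   echo "unexpected status $1" ;;
    esac
}
```

Usage would look like `explain_status "$(http_status http://mirror.example.org/gmt/data/gmt_data_server.txt)"`. Adding -f (--fail) to a plain download is another option: curl then exits with an error on HTTP status 400 or above instead of saving the error page.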

PaulWessel commented 4 years ago

I can ask the SOEST people. Please tell me the full directory path name with the gmt data. For gmtserver that is


and in that directory lie the gmt_data_server.txt and all the rest.

joa-quim commented 4 years ago

Data is physically at /home/gmtdata/data_6.1 and symlinked to /var/www/html/gmt/data/

[jluis@fct-gmt data_6.1]$ ll /var/www/html/gmt/
total 0
lrwxrwxrwx. 1 jluis jluis 22 Jul  9 16:25 data -> /home/gmtdata/data_6.1
joa-quim commented 4 years ago

There are no words for this funix world. After hours and hundreds of pages I found the obscure possibility that a certain SELinux was active and then a certain chcon should be used

Another possibility for this error is that you are running SELinux (Security Enhanced Linux), in which case you need to use chcon to apply the proper security context to the directory. One easy way to do this is to copy from a directory that does work, for example /var/www/

chcon -R --reference=/var/www /path/to/webroot

That finally did the trick of solving the permissions problem that no other tool told me about.
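A sketch for spotting this case earlier, assuming the SELinux user tools are installed: getenforce prints Enforcing, Permissive, or Disabled. The function name is made up, and the fallback string is only for systems without those tools.

```shell
# Report the SELinux mode, or "unavailable" if getenforce is not installed.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unavailable"
    fi
}
```

If this prints Enforcing, comparing the output of `ls -Zd /var/www /path/to/webroot` shows whether the two directories carry different security contexts before chcon is applied.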

But it couldn't end here. Next was the problem that GMT uses https, but Apache servers are set to http by default.

Tons of pages later and less and less hair I finally managed to do something, but still kindly warns you that the site will probably steal and destroy your computer.

At the end, this works for me


 gmt grdinfo @earth_relief_15m -V
grdinfo [NOTICE]: Remote data courtesy of GMT data server DATA []

grdinfo [NOTICE]: Earth Relief at 15x15 arc minutes from Gaussian Cartesian filtering (28 km fullwidth) of SRTM15+V2.1 [Tozer et al., 2019].
grdinfo [NOTICE]:   -> Download grid file [1.4M]: earth_relief_15m_p.grd
grdinfo [INFORMATION]: Downloading file ...
grdinfo [INFORMATION]: Writing Data Table to Standard Output stream
grdinfo [INFORMATION]: Processing grid c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: Title: Earth Relief at 15 arc minutes
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: Command: grdfilter -Fg27.8 -D1 -I15m -rp -Gearth/earth_relief/earth_relief_15m_p.grd=ns+s0.5+o0 --IO_NC4_DEFLATION_LEVEL=9 --IO_NC4_CHUNK_SIZE=4096 --PROJ_ELLIPSOID=Sphere
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: Remark: Obtained by Gaussian Cartesian filtering (27.8 km fullwidth) from [Tozer et al., 2019;]
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: Pixel node registration used [Geographic grid]
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: Grid file format: ns = GMT netCDF format (16-bit integer), CF-1.7
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: x_min: -180 x_max: 180 x_inc: 0.25 (15 min) name: longitude n_columns: 1440
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: y_min: -90 y_max: 90 y_inc: 0.25 (15 min) name: latitude n_rows: 720
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: z_min: -10290.5 z_max: 6287 name: elevation (m)
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: scale_factor: 0.5 add_offset: 0 packed z-range: [-20581,12574]
c:/j/.gmt/server/earth/earth_relief/earth_relief_15m_p.grd: format: netCDF-4 chunk_size: 131,144 shuffle: on deflation_level: 9
seisman commented 4 years ago

works for me, but doesn't.

seisman commented 4 years ago

Here is a simple bash script to check the GMT servers:

for url in \
 ; do

    echo "URL: $url"
    gmt set GMT_DATA_SERVER $url
    gmt which -Ga @earth_relief_01d
    echo ""

    gmt clear data
    rm gmt.conf
done

Here is the output:

gmtwhich [NOTICE]: Remote data courtesy of GMT data server DATA []

gmtwhich [NOTICE]: Earth Relief at 1x1 arc degrees from Gaussian Cartesian filtering (111 km fullwidth) of SRTM15+V2.1 [Tozer et al., 2019].
gmtwhich [NOTICE]:   -> Download grid file [115K]: earth_relief_01d_p.grd

gmtwhich [NOTICE]: Remote data courtesy of GMT data server DATA []

gmtwhich [NOTICE]: Earth Relief at 1x1 arc degrees from Gaussian Cartesian filtering (111 km fullwidth) of SRTM15+V2.1 [Tozer et al., 2019].
gmtwhich [NOTICE]:   -> Download grid file [115K]: earth_relief_01d_p.grd

gmtwhich [ERROR]: Remote download is currently deactivated
gmtwhich [ERROR]: Unable to obtain remote file @earth_relief_01d
gmtwhich [ERROR]: File @earth_relief_01d not found!

gmtwhich [NOTICE]: Remote data courtesy of GMT data server EUROPE []

gmtwhich [NOTICE]: Earth Relief at 1x1 arc degrees from Gaussian Cartesian filtering (111 km fullwidth) of SRTM15+V2.1 [Tozer et al., 2019].
gmtwhich [NOTICE]:   -> Download grid file [115K]: earth_relief_01d_p.grd

gmt [ERROR]: Bad record counter in file /Users/seisman/.gmt/server/gmt_data_server.txt
gmtwhich [NOTICE]:   -> Download cache file: @earth_relief_01d
gmtwhich [ERROR]: Libcurl Error: HTTP response code said error
gmtwhich [WARNING]: You can turn remote file download off by setting GMT_AUTO_DOWNLOAD off.
gmtwhich [ERROR]: File earth_relief_01d not found!

gmtwhich [NOTICE]: Remote data courtesy of GMT data server OCEANIA []

gmtwhich [NOTICE]: Earth Relief at 1x1 arc degrees from Gaussian Cartesian filtering (111 km fullwidth) of SRTM15+V2.1 [Tozer et al., 2019].
gmtwhich [NOTICE]:   -> Download grid file [115K]: earth_relief_01d_p.grd

Two of the six URLs fail:

PaulWessel commented 4 years ago

So gmtserver will be down until tomorrow because of the non-hurricane and UH being closed on Monday. Hurricanes, power outages etc. in Hawaii only happen on weekends or days when we close, so we always have to wait extra time for fixes, such as this one. Power is on, but gmtserver is not reachable, and staff cannot fix it remotely.

PaulWessel commented 4 years ago

Oh, sorry, I see it is magically back up.

PaulWessel commented 4 years ago

Maybe try again? I did the two http/s flavors for oceania and both worked.

seisman commented 4 years ago

Now I run this script:

gmt clear data
gmt which -Ga @earth_relief_01d -V

The output is:

gmt [INFORMATION]: Downloading file ...
gmt [ERROR]: Bad record counter in file /Users/seisman/.gmt/server/gmt_data_server.txt
gmt [INFORMATION]: Unable to read server information file
gmt [INFORMATION]: Downloading file ...
gmtwhich [INFORMATION]: Writing Data Table to Standard Output stream
gmtwhich [NOTICE]:   -> Download cache file: @earth_relief_01d
gmtwhich [INFORMATION]: Downloading file ...
gmtwhich [ERROR]: Libcurl Error: HTTP response code said error
gmtwhich [WARNING]: You can turn remote file download off by setting GMT_AUTO_DOWNLOAD off.
gmtwhich [ERROR]: File earth_relief_01d not found!

The first 10 lines of the gmt_data_server.txt file are:

$ head -n 10 /Users/seisman/.gmt/server/gmt_data_server.txt
# Master table with information about all the remote data sets available on the GMT Data server
# Changes related to data on the gmtserver must be added into this file before they are available to users.
# Set your editor to TAB width = 4 for alignments.
# Updated June 20, 2020
# Note: The crontab script will count non-commented lines and write that count, then append
#       the non-commented lines of this file and place it in the data directory. It is that file that is synced by users.
# Here is a longer explanation of the columns below
# Dir:      The directory on gmtserver (i.e., under /home/gmtserver/data) where files are found

It seems gives the wrong file.
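The header comments above say that a crontab script counts the non-commented lines, writes that count, then appends those lines to produce the file users sync. A sketch of that step and of a matching check, assuming the count sits alone on the first line. The function names are hypothetical and the real script may differ.

```shell
# Build the served table: record count first, then the non-comment lines.
build_served_table() {
    count=$(grep -cv '^#' "$1")
    { echo "$count"; grep -v '^#' "$1"; } > "$2"
}

# Succeed if the first line equals the number of lines that follow it.
check_record_count() {
    declared=$(head -n 1 "$1")
    actual=$(tail -n +2 "$1" | wc -l | tr -d ' ')
    [ "$declared" = "$actual" ]
}
```

Under that assumption, a file that still begins with "# Master table ..." fails the check, which would be consistent with the "Bad record counter" error above.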