Closed Esteban82 closed 1 year ago
I think we still need a conversation about what to do with tests. Having 60 tests fail each time we update earth_relief means lots of extra work and growth of the dvc system. I think @maxrjones and I were discussing maybe using a test server when running the tests so that we would get a stable set of remote grids to avoid this problem. These files would ideally not be mirrored across to all servers.
I think that this issue can be closed.
OK, working on this, but copying from Hawaii to Oslo is, well, a bit slow!
I may wish to rename the test.generic-mapping-tools.org entry to next or candidate or some other word than test. The data in test are not for testing but candidate data sets for the next release. The word "test" should retain the meaning it has in all of GMT (test dir, test scripts, etc.), and hence setting GMT_DATA_SERVER = test would be what we do to run all the tests using the reference data set. GMT_DATA_SERVER = next would (now) give access to the unreleased files like venus and mars.
I may also need to come up with another dir structure on the user's computer: .gmt is what GMT uses. However, if you or I want to try venus, we don't want GMT to place venus under the .gmt dir but probably use a separate dir, like ~/.gmt-test and ~/.gmt-next. I have accidentally overwritten stuff in ~/.gmt many times because of forgetting to move .gmt out of the way.
Any preference for next vs candidate? I don't want to use release-candidate since it is long.
Would this be OK with you? I would need IT to change the test->gmtserver setting.
I may wish to rename the test.generic-mapping-tools.org entry to next or candidate or some other word than test. The data in test are not for testing but candidate data sets for the next release. The word "test" should retain the meaning it has in all of GMT (test dir, test scripts, etc.), and hence setting GMT_DATA_SERVER = test would be what we do to run all the tests using the reference data set. GMT_DATA_SERVER = next would (now) give access to the unreleased files like venus and mars.
I prefer candidate, which warns users not to use it.
I may also need to come up with another dir structure on the user's computer: .gmt is what GMT uses. However, if you or I want to try venus, we don't want GMT to place venus under the .gmt dir but probably use a separate dir, like ~/.gmt-test and ~/.gmt-next. I have accidentally overwritten stuff in ~/.gmt many times because of forgetting to move .gmt out of the way.
Is it possible to change it via environment variables like GMT_DATADIR or GMT_USERDIR?
Yes, GMT_USERDIR=~/.gmt-candidate will use that dir for all "./gmt" work (look for data, create session). So we can let our remote-data test scripts set that, for instance.
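A minimal sketch of how a remote-data test script might keep its downloads out of ~/.gmt. Only the GMT_USERDIR variable comes from the comment above; the helper name use_gmt_userdir and the ~/.gmt-NAME naming are assumptions for illustration, not part of GMT.

```shell
# Hypothetical helper (not part of GMT): select a separate GMT user
# directory so candidate or test data never land in ~/.gmt.
use_gmt_userdir() {
    name="$1"                                # e.g. "candidate" or "test"
    GMT_USERDIR="${HOME}/.gmt-${name}"
    export GMT_USERDIR
    mkdir -p "${GMT_USERDIR}"                # create it on first use
}
```

A test script could call `use_gmt_userdir candidate` once at the top, before any gmt command runs, so that every later download and session directory goes under ~/.gmt-candidate.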
OK, candidate is a good name.
Since our hope is that Geoscope can be persuaded to take over the hosting I want to have a clear structure on oceania first. Right now the top gmt directory looks like this:
-bash-4.2$ pwd
/export/gmtserver/gmt
-bash-4.2$ ls -l
total 28
drwxrwxr-x 2 pwessel gmt 4096 May 21 2020 BlackMarble
drwxr-xr-x 2 seisman gmt 286 May 11 2020 BlackMarble2016
drwxrwxr-x 2 pwessel gmt 4096 May 21 2020 BlueMarble
drwxrwxr-x 2 pwessel gmt 33 Aug 11 2021 LOGS
lrwxrwxrwx 1 pwessel gmt 8 Jan 28 2022 data -> data_6.2
drwxrwxr-x 4 pwessel gmt 4096 May 29 2020 data_6.0
drwxrwxr-x 5 pwessel gmt 4096 Jan 28 2022 data_6.1
drwxrwxr-x 5 pwessel gmt 4096 Aug 15 03:00 data_6.2
drwxrwxr-x 9 pwessel gmt 198 Aug 3 08:00 gmtserver-admin
drwxrwxr-x 2 pwessel gmt 4096 Oct 17 2019 old-earth-reliefs-v1
drwxrwxr-x 2 pwessel gmt 4096 Mar 15 2020 old-earth-reliefs-v2
drwxrwxr-x 4 pwessel gmt 87 Apr 30 2022 static
drwxrwxr-x 4 pwessel gmt 261 Jul 29 01:51 test
Only the directory data (i.e., data_6.2) is mirrored or used for reading data by users. Currently, test is where new candidate data should be placed until we release them. SOEST IT helped us set things so that oceania.generic-mapping-tools.org points to the data dir contents while test.generic-mapping-tools.org points to the test dir contents. Here are proposed steps:
Comments allowed!
Since our hope is that Geoscope can be persuaded to take over the hosting
Do you mean "EarthScope" (https://www.earthscope.org/)?
- I see no point keeping the data_6.0, data_6.1 directories as gmtserver just fills up.
Yes, actually the name "data_6.2" also makes no sense to me. We should simply name it "data", without any version string.
The problematic new synbath dataset in https://github.com/GenericMappingTools/gmtserver-admin/issues/213 warns us to always back up the old dataset before updating. I think we should have four directories:
- data: the directory for the "oceania" server
- candidate: the directory for working on the dataset updates
- test: the test server that will be used by GMT tests
- olddata (or other names): the directory that contains the old server
When we update an existing dataset or add a new dataset, we should first copy it to the candidate directory. If the dataset looks good, then we should move the dataset from "data" to "olddata" before copying it from "candidate" to "data". It's a little complicated, but it makes sure that we can quickly revert to the correct "old" dataset if the updated dataset has issues (like #213).
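The update sequence described above could be scripted roughly as follows. This is only a sketch: promote_dataset is a hypothetical function name, and the data, candidate and olddata directories are the ones proposed in this comment, not an existing layout on the server.

```shell
# Hypothetical sketch: back up the live dataset, then publish the candidate.
# $1 = root directory holding data/, candidate/ and olddata/
# $2 = dataset directory name
promote_dataset() {
    root="$1"
    name="$2"
    if [ ! -d "${root}/candidate/${name}" ]; then
        echo "promote_dataset: no candidate dataset '${name}'" >&2
        return 1
    fi
    mkdir -p "${root}/data" "${root}/olddata"
    if [ -d "${root}/data/${name}" ]; then
        # Keep exactly one previous copy so a bad update can be reverted.
        rm -rf "${root}/olddata/${name}"
        mv "${root}/data/${name}" "${root}/olddata/${name}"
    fi
    # Copy rather than move, so candidate stays intact until verified.
    cp -R "${root}/candidate/${name}" "${root}/data/${name}"
}
```

Reverting would then be the reverse move: delete the dataset under data and move the copy under olddata back into place.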
- A directory (maybe called reference or read-only or testdata) will be placed in candidate so that these are not mirrored anywhere.
I don't understand this point.
- I will make some simple changes in gmt_remote.c so that when server is test then it inserts a /reference or similar string in the URL so we fish from the reference directory when running our test in CI or locally.
I'm a little confused about this point, too. From your points 7 and 8, are you suggesting adding a reference directory in both candidate and test server? What are the files in the reference directory?
Yes EarthScope sorry. We do them lots of favours with the GMT for Geodesy course so they better help us out.
Yes, data_6.2 serves no function either so we can eliminate that as well and just have data.
The complication re 7-8 has to do with some limit of what SOEST can do. Right now we have two redirects (oceania and test) to different subdirectories on the same server (gmtserver.soest.hawaii.edu). It would of course be cleaner if we can simply add 2 more: candidate and old data. Then we would not have to hunt for directories inside others, etc. Let me ask IT if that is possible (cleaner for us) or what the issue is.
I notice that sometime earlier we had IT set up static.generic-mapping-tools.org for the purpose of using the static reference data that won't change when we do tests. Since already set up we can have oceania, static (may take over for test) and add candidate? I don't think we need old data.generic... since a makefile will do stuff like moving things around on the server.
Didn't this use to work?
curl -L data.generic-mapping-tools.org:/gmt_data_server.txt
Sorry, these work, but you have to do it right:
curl -L http://data.generic-mapping-tools.org/gmt_data_server.txt
curl -L http://static.generic-mapping-tools.org/gmt_data_server.txt
curl -L http://test.generic-mapping-tools.org/gmt_data_server.txt
So if we agree that the old data or similar does not need to be accessible (we will use cp -f etc on the server via makefiles) we have the three we need?
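A quick way to confirm that each redirect answers, sketched as a small shell helper. The function names server_url and check_servers are made up for illustration; the URL pattern is the one used in the curl commands above.

```shell
# Hypothetical helpers: build the gmt_data_server.txt URL for a server
# name and report whether each server responds.
server_url() {
    printf 'http://%s.generic-mapping-tools.org/gmt_data_server.txt\n' "$1"
}

check_servers() {
    for name in "$@"; do
        # -f: fail on HTTP errors, -sS: quiet but show errors, -L: follow redirects
        if curl -fsSL -o /dev/null "$(server_url "$name")"; then
            echo "${name}: ok"
        else
            echo "${name}: FAILED"
        fi
    done
}
```

For the three servers discussed here this would be run as `check_servers oceania static candidate`.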
Since already set up we can have oceania, static (may take over for test) and add candidate?
Sounds good to me.
It seems that oceania, candidate and static are all online now. Does this mean we can remove the old earth_relief_2.1 directory from the main server?
Let's hear from @Esteban82 who has done a lot of the work. @federico, what remains, if anything, on the server(s)? My thought was that we placed all the new and updated stuff on candidate, then when 6.5 is released we update oceania with everything on candidate.
@seisman, you may have answered this before, but cannot recall:
When the CI runs all the tests, from where does it pull @earth_relief_XXXX etc? Current oceania or some cached files before any updates? I am hoping we can put what is compatible on the static server and then we need to pass static.generic-mapping-tools.org when running tests. This will eliminate any data differences and then we can focus on actual failures. It is too hard to deal with 70-80 failures when many are data driven.
We have a workflow that downloads the GMT remote data from the oceania server and stores them as GitHub Action cache files. The cache files will then be used by the tests.
OK, so this happens each time the full test is run? I.e., the cache is updated at that time so if I test with oceania we are using the same version?
No, the workflow is scheduled to run once every week (https://github.com/GenericMappingTools/gmt/actions/workflows/ci-caches.yml). If you make any changes to oceania, then we have to manually trigger the workflow to update the caches. After that, you and the CI will use the same version.
Wrote a script that determines which remote files are used in our doc and test scripts. Got these 35 (some are the same since without registration we default to _p):
@earth_age_02m
@earth_age_02m_p
@earth_age_06m
@earth_age_06m_p
@earth_age_10m
@earth_age_10m_p
@earth_day_01d
@earth_day_01m
@earth_day_01m_p
@earth_day_15m
@earth_relief_01d
@earth_relief_01d_g
@earth_relief_01m
@earth_relief_01m_p
@earth_relief_02m
@earth_relief_02m_p
@earth_relief_03s
@earth_relief_04m
@earth_relief_04m_p
@earth_relief_05m
@earth_relief_05m_g
@earth_relief_05m_p
@earth_relief_06m
@earth_relief_06m_p
@earth_relief_10m
@earth_relief_10m_g
@earth_relief_10m_p
@earth_relief_15m
@earth_relief_15m_p
@earth_relief_20m
@earth_relief_20m_g
@earth_relief_30m
@earth_relief_30m_p
@earth_relief_30s
@earth_relief_30s_p
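The script itself is not shown in the thread. A list like the one above could be produced along these lines, assuming a hypothetical helper named list_remote_files and a pattern that only covers @earth_NAME_RES names with an optional _g or _p suffix:

```shell
# Hypothetical sketch: print the unique @earth_* remote file names that
# appear anywhere under the given directories.
list_remote_files() {
    # -r recurse, -h omit file names, -o print only the match, -E extended regex
    grep -rhoE '@earth_[a-z]+_[0-9]{2}[dms](_[gp])?' "$@" | sort -u
}
```

Running `list_remote_files doc test` from the top of a GMT checkout would be one way to use it. The pattern is deliberately narrow, so other planets or other naming schemes would need their own expression.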
Perhaps we should just place these on the static server (copy from oceania for now) and see how that goes with testing that server? Also, @joa-quim has a point: why use 01m tiles unless the bug is specific to tiling or high-res tiling? If it works for 01m, 05m and 06m then we should simplify the tests and use 06m instead. This means updating some PS files in DVC. The doc scripts and examples may use what they use since we want nice images and not blurry ones.
Perhaps we should just place these on the static server (copy from oceania for now) and see how that goes with testing that server?
Sounds good to me.
Let's hear from @Esteban82 who has done a lot of the work. @federico, what remains, if anything, on the server(s)? My thought was that we placed all the new and updated stuff on candidate, then when 6.5 is released we update oceania with everything on candidate.
Yes, I think that we can delete from:
Great, can you take care of removing those? Then candidate is fully loaded (venus, moon etc)?
Great, can you take care of removing those?
From both servers, right?
Then candidate is fully loaded (venus, moon etc)?
Yes
Yes, from oceania and candidate since we don't reference 2.1 anywhere
@PaulWessel you will have to delete earth_relief2.5
-bash-4.2$ rm -r earth_relief2.5/
rm: cannot remove ‘earth_relief2.5/earth_relief_03s_g/N44E004.earth_relief_03s_g.nc’: Permission denied
rm: cannot remove ‘earth_relief2.5/earth_relief_03s_g/N44E003.earth_relief_03s_g.nc’: Permission denied
rm: cannot remove ‘earth_relief2.5/earth_relief_03s_g’: Directory not empty
I deleted earth_relief2.1 in both sites.
Thanks, and sorry, I see the 3s had incomplete permissions for the group...
I deleted earth_relief2.1 in both sites.
We deleted about 51 GB from the server. Joaquim must be happy.
I think we should remove earth_relief2.1 and 2.5 from test as well. It is better to leave it tidy.
Yes, I think test is just for crazy experiments with new things until that dataset is stable and can go on candidate. So please clean!
So please clean!
I deleted earth_relief2.1. For the 2.5 I don't have group permissions.
OK everything under test should now have group permissions rw, so you should be able to delete
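For reference, a permission fix of this kind can be sketched as below. The exact command used on the server is not shown in the thread, so this is an assumption; grant_group_rw is a hypothetical name. Note that deleting a file requires write permission on the directory that contains it, which is why the 03s tile directory blocked the earlier rm.

```shell
# Hypothetical sketch: give the group read/write access to a whole tree.
# Must be run by the owner of the files (or root).
grant_group_rw() {
    # X adds execute (search) only to directories and to files that are
    # already executable, so plain data files do not become executable.
    chmod -R g+rwX "$1"
}
```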
I have a doubt. Should I delete that directory or everything?
You mean "test" itself? No, let that one sit empty but server etc goes
Ok, so I will delete everything inside the server directory.
/export/gmtserver/gmt/test/server/**
Done. But I still can't delete these files.
-bash-4.2$ pwd
/export/gmtserver/gmt/test/server
-bash-4.2$ rm -r *
rm: cannot remove ‘earth/earth_relief2.5/earth_relief_03s_g/N44E004.earth_relief_03s_g.nc’: Permission denied
rm: cannot remove ‘earth/earth_relief2.5/earth_relief_03s_g/N44E003.earth_relief_03s_g.nc’: Permission denied
rm: cannot remove ‘earth/earth_relief2.5/earth_relief_03s_g’: Directory not empty
Should I also delete the files within /export/gmtserver/gmt/test/cache?
Yep
On 9 September 2023 at 16:44:47, Federico Esteban wrote:
Should I also delete the files within /export/gmtserver/gmt/test/cache?
Just deleted the files in cache.
I think we can close the issue.
I was looking in the test server and I found these two folders. I think that the first is an old version (2.1) of the SRTM15, and thus could be deleted.
http://test.generic-mapping-tools.org/server/earth/earth_relief2.1/
http://test.generic-mapping-tools.org/server/earth/earth_relief/