adamreichold / umwelt-info

umwelt.info metadata index
https://umwelt.info
GNU Affero General Public License v3.0

Start adding scripts to make ad-hoc analyses on the metadata catalogue. #73

Closed adamreichold closed 2 years ago

adamreichold commented 2 years ago

Such analyses can, for example, inform decisions on software optimizations, as in the second commit included here.

The analysis itself finishes in under 10 seconds when run against a local release build of the server and a catalogue of 65k datasets:

> time python3 resources.py 
1: 36.0 %
2: 59.4 %
3: 69.0 %
4: 78.1 %
0: 87.0 %
5: 94.5 %
13: 96.1 %
6: 97.1 %
8: 97.6 %
9: 98.1 %
...
44: 100.0 %
57: 100.0 %
84: 100.0 %
61: 100.0 %
65: 100.0 %

real    0m8,661s
user    0m1,306s
sys     0m0,043s

Fetching the same amount of data over an SSH tunnel from our server takes a bit more than a minute and produces between 20% and 30% CPU utilization on that server.
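
For readers who have not opened `resources.py`: the output above reads like a cumulative frequency distribution, with one line per distinct value, most frequent value first, followed by the running share of all datasets. The following is a minimal sketch of that kind of computation, not the script from this PR. It assumes the per-dataset values (for example, the number of resources per dataset) have already been fetched into a list; the function names `cumulative_shares` and `format_shares` and the sample data are hypothetical.

```python
from collections import Counter


def cumulative_shares(values):
    """Return (value, cumulative percent) pairs, most frequent value first.

    `values` is assumed to hold one number per dataset, e.g. how many
    resources that dataset has. This is an illustrative assumption, not
    the actual data model used by the server.
    """
    histogram = Counter(values)
    total = sum(histogram.values())
    shares = []
    running = 0
    # most_common() yields (value, frequency) sorted by descending frequency.
    for value, frequency in histogram.most_common():
        running += frequency
        shares.append((value, 100.0 * running / total))
    return shares


def format_shares(shares):
    """Format the pairs in the same 'value: percent %' style as the output above."""
    return [f"{value}: {percent:.1f} %" for value, percent in shares]


if __name__ == "__main__":
    # Made-up sample data, only to demonstrate the calculation.
    sample = [1, 1, 1, 1, 2, 2, 2, 0, 0, 3]
    for line in format_shares(cumulative_shares(sample)):
        print(line)
```

Sorting by frequency rather than by value makes it easy to see how few distinct cases cover most of the catalogue, which is the kind of observation that can justify an optimization for the common case.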