Closed docuracy closed 4 months ago
@kgeographer
Some portal pages (well, at least one - I can't check others because the server is rebooting, I think) are failing to load due to malformed timespans, which come directly from the timespans
JSONField in the Place
model, for example (https://dev.whgazetteer.org/places/13970069/portal/):
... [200, 1933], [500, 1800], '[', '2', '0' ...
It seems that integer values might have been stored as strings within encoded quote marks (or possibly the square brackets and commas are enclosed in encoded quote marks, though this seems less likely).
I see what the issue is with timespans, and have partially fixed it - some values were malformed in the migration, enclosing the entire value in double quotes; got rid of those. Also many nulls got transformed to '"\N"' and similar - fixed that. There are 183 records that have a mix of integers and quoted strings in the arrays - so the arrays are valid...that's how they are in v2. Ok to remain?
e.g. [[-600, 600], ["135", "250"]]
and [[-9999, -1000], ["135", "250"], [1525, 1525]]
I'll have to speak to the developer about normalization :^P
KG: slow load on server, not locally so it's the data. removed collections from featured
status, no help. the browser network console shows 'pending' [figs below]. best to wait & see whether adding collection data solves it.
re: basemaps, only the home and gallery pages must have 'bare topography' - areas and tasks can also, no need for context in those - well maybe countries would be nice to see
on any that offer a zoom to feature, there has to be a buffer - most zoom too far in by far
re: basemaps, only the home and gallery pages must have 'bare topography' - areas and tasks can also, no need for context in those - well maybe countries would be nice to see
on any that offer a zoom to feature, there has to be a buffer - most zoom too far in by far
There are now 3 different map styles in use:
If it becomes apparent that anything is annoying on the areas and tasks pages, it's easy enough to define an additional style. And next time I'll be sure to avoid spurious trailing commas in JSON lists (which are legal in Python data structures): the tileserver doesn't do dodgy JSON.
There's now a global defaultZoom
variable set to the value 8
in base.js
, which is applied as a maximum zoom parameter wherever the whg_maplibre.fitViewPort
method is called - except on the home page and in carousels. It seems to me to be about right in most contexts, but can now easily be adjusted.
The codebase was using a UMD version of spin.js
, which according to the documentation "is only available as a temporary workaround".
It is now (with the next PR) using the recommended ECMAScript module, bundled with Webpack and extended in base.js
to allow initialisation and stopping with simple JQuery selectors. Scaling is determined automatically based on the size of the target container.
Examples:
$('#drftable_list, #detail').spin(); // Start spinners on single or multiple elements - can also use `.class` selectors
$('#detail, .maplibregl-control-container').stopSpin(); // Stop existing spinners on single or multiple elements
_cull
method (remove smallest instead of random) and increase MAX_ENTRIES
mapdata
functionality (including cache management) to utils.mapdata
mapdata
generation for each of 3 data types with focus on efficiency and speed?variant=tileset
mapdata should also create cache of geometry-less mapdatamapdata
should provide centroids only if any Collection.vis_parameters facet has a true
trail
property.needs_tileset
and trigger mapdata_task
(for either 'tileset' or 'standard') on change of publication status - test on DEV with these (which otherwise timeout on attempted page load):~ [Moved to new Issue #227]
featured
flag is changed)~ #235 vis_parameters
, JSON in database enclosed in double quotes [@kgeographer: had these been set manually in the database, or is there a programmatic bug?]
/app/utils/cluster_geometries.py
fails for small datasets, with "n_samples=7 should be >= n_clusters=10"null
geometry due to invalid extent
null
extenttiler.js
to prevent attempted tileset generation.no-results
container should hide Drawing and Temporal filters and Fullscreen control.~ Need to make use of spinner instead of clearing results while temporal and spatial search updates are being made.Optimisation of datasets.utils.fetch_mapdata_ds
still results in very long processing time, and so I have now implemented Django's FileBasedCache
for this function and for collection.views.fetch_mapdata_coll
. Server response times are reduced drastically after the first visit to a page:
Chinese Counties: 5.53s -> 29.53ms Chinese Towns: 55.57s -> 515ms Pokémon: 352ms -> 8.87ms
I've set the cache system (in settings.py
) with no expiry, meaning that the functions will need to be called with ?refresh_cache=true
whenever the dataset or collection is updated, or whenever it has a tileset generated or removed. Calling them on first publication would of course remove the potentially-long wait on a first visit to the associated page.
Apart from evident performance gains, this will reduce the need to upgrade server processing and memory capacity. The disk capacity cost of maintaining a set of cache files (just one for each Dataset/Collection) is probably slight, but my security clearance level does not permit me to access the (new) /cache
directory to assess the size of the generated files.
Modifications are currently running on DEV in a hotfix branch. I'll copy them to my local branch and make a PR in due course if you're satisfied with them, @kgeographer.
I've identified the problem that was causing Portal pages to fail to load properly. If the places had only one timespan between them the histogram generator class (in historygram.js
) would crash (silently, for some reason). Hotfix applied (no histogram for such cases).
The class would also fail (and the page stall) if the total range exceeded 500 years: that's also hotfixed.
Incidentally, London in the DK dataset includes coordinates on both sides of the Atlantic, wrongly implying that they are the same place. The gloss includes metadata extracted from the map layers at the centroid of these points, mid-Atlantic.
After much trial-and-error hot-fixing, I've identified the source of the sometimes > 60-second delay in loading Place Detail pages, on the Dataset model:
@property
def minmax(self):
# TODO: temporal is sparse, sometimes [None, None]
timespans = [p.minmax for p in self.places.all() \
if p.minmax and len(p.minmax) == 2 and p.minmax[0]]
# print('ds; timespans as model property', self.label, timespans)
earliest = min([t[0] for t in timespans]) if len(timespans) > 0 else None
latest = max([t[1] for t in timespans]) if len(timespans) > 0 else None
minmax = [earliest, latest] if earliest and latest else None
return minmax
I've replaced it with this, which deals with the #TODO and is much more efficient:
@property
def minmax(self):
minmax_values = Place.objects.filter(dataset=self).aggregate(
# This ignores `None` values, effectively handling temporal sparsity [None, None]
earliest=Min('minmax__0'),
latest=Max('minmax__1')
)
earliest = minmax_values['earliest']
latest = minmax_values['latest']
return [earliest, latest] if earliest and latest else None
Server response time for https://dev.whgazetteer.org/places/5907944/detail is reduced from 62 seconds to 1.05 seconds.
Search from Home Page was broken, because after a successful search and redirection to the Search Page, search.js
was not loading results from the last_search
in LocalStorage. This seems to be due to tightening of default browser security policies, disallowing access to document.referrer. There were also problems with the way fclasses
were being (mis-)handled. Hotfixes copied to my current branch.
FYI, @kgeographer, some of the trace data (well, one trace in one collection that I know of, number 76) has a zero day number in a yearstring (which was breaking fetch_mapdata_coll
). I've modified collection.views.year_from_string
to handle it, but you might want to parse incoming data to eliminate things like that.
Bug found when examining test dataset locally: even if this is just a test data problem, code should be made more robust to handle this eventuality.
http://localhost:8001/api/place/25195/
Traceback (most recent call last):
File "/py/lib/python3.10/site-packages/django/core/handlers/exception.py", line 56, in inner
response = get_response(request)
File "/py/lib/python3.10/site-packages/django/core/handlers/base.py", line 197, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/py/lib/python3.10/site-packages/django/views/generic/base.py", line 103, in view
return self.dispatch(request, *args, **kwargs)
File "/py/lib/python3.10/site-packages/django/views/generic/base.py", line 142, in dispatch
return handler(request, *args, **kwargs)
File "/app/api/views.py", line 1197, in get
"types": add_urls(sort_unique([type for place in serialized_places for type in place["types"]], 'label')),
File "/app/api/views.py", line 1130, in sort_unique
item_unique = item[key] if key else json.dumps(item)
Exception Type: KeyError at /api/place/25195/
Exception Value: 'label'
carouselMetadata
from Home Page context/HTML and fetch with AJAX.~/task_progress/
intileset-tools.js
.null
entry among the "types" in the.gloss
./app/utils/cluster_geometries.py
is prone to causing Out of Memory errors. Switched from Agglomerative to KMeans clustering: seems quicker locally, but needs to be tested with large datasets on dev server.v30a4
portal.js
whg-modal.css
home.bundle.js:985 Uncaught ReferenceError: csrfToken is not defined at initiateSearchHome (home.bundle.js:985:20)
addedwindow.csrfToken = "{{ csrf_token }}";
https://dev.whgazetteer.org/main/datasets_list
is not paginated, is it, so can't the table be sorted in the browser without an additional call to the server? kg: this is quite a convoluted setup, loading '_list' pages into the dashboard_admin page. Tricky because each tab has different search/filter requirements. I think doing something else is a major refactor. This page is seen only by the team, so a spinner should be fine for now; low priority pls defer~null
images - do not include these in the.ds-card-blurb
divs.search/suggestions
which return an error).