Closed: ChristinaLast closed this 6 months ago
Great. @andrewphilipsmith Would love to review these when you are ready to discuss options!
Below are various thoughts and questions about the visualisation for the tutorial. I'm still checking whether the answers already exist somewhere I've not yet spotted, but I've written them down here anyway.
Assuming that the csv files are typical of MapReader's output, I think there is a case for doing some post-processing of the results to get them into a format suitable for visualisation (which might be different from a format that is suitable for further analysis).
Some non-exclusive options:

- Convert the csv files to a spatially-indexed storage format, so that at high zoom levels only the required points need to be read from disk.
- Interpolate a raster image from the points. A reasoned and repeatable rule-based approach could be used for overlapping patches that have been categorised differently. The resulting image could be tiled, so it is suitable for display at multiple scales. (Switching to vector points/patches at high zoom levels would still be appropriate.)
- If we (a) exclude the unclassified patches and (b) dissolve the classified patches, then we might have the data in a form well suited for vector tiles, a format designed for displaying large datasets at a range of scales.
Creating a notebook that does this post-processing would be possible (I've made a start on this). Some of the input parameters of MapReader would need to be accessible (notably the patch size/geometry).
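To make the notebook idea concrete, here is a minimal sketch of one post-processing step: turning a patch csv into a GeoJSON FeatureCollection that web-mapping tools can display directly. The column names (`min_x`, `min_y`, `max_x`, `max_y`, `predicted_label`) are assumptions about the output schema, not MapReader's actual headers, and would need adjusting to the real csv.

```python
import csv
import io
import json

def patches_to_geojson(csv_text, label_field="predicted_label"):
    """Convert a MapReader-style patch csv into a GeoJSON FeatureCollection.

    Column names are assumed (min_x, min_y, max_x, max_y, predicted_label);
    adjust to the real csv headers before using.
    """
    features = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        x0, y0 = float(row["min_x"]), float(row["min_y"])
        x1, y1 = float(row["max_x"]), float(row["max_y"])
        features.append({
            "type": "Feature",
            "properties": {"label": row[label_field]},
            "geometry": {
                "type": "Polygon",
                # one closed ring: the patch's four corners, repeating the first
                "coordinates": [[[x0, y0], [x1, y0], [x1, y1],
                                 [x0, y1], [x0, y0]]],
            },
        })
    return {"type": "FeatureCollection", "features": features}

# Tiny illustrative input (labels are made up):
sample = (
    "min_x,min_y,max_x,max_y,predicted_label\n"
    "0,0,100,100,rail_space\n"
    "100,0,200,100,no_rail_space\n"
)
fc = patches_to_geojson(sample)
geojson_str = json.dumps(fc)  # ready to write to a .geojson file
```

The resulting file can be dropped straight onto a leafmap/folium map, which is one argument for a visualisation-specific output format.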
I've had an initial play with leafmap, which I'm sure is capable of doing what we require (at least within the tutorial's scope). However, I'd want a better understanding of the issue above before making a final recommendation.
- Are we expecting users to work through the tutorial using their own data or with sample data that we provide?
For a tutorial, it will be with the provided sample data (e.g. the 1-inch OS maps).
But we also need to include clearer instructions for how to prepare input for MapReader when people want to bring their own maps to the tool. E.g. which kinds of maps work best, how many, in what format. This is part of the README update I am planning.
- Does the tutorial cover using MapReader at scale? Applying MapReader to hundreds of map sheets (with overlapping margins) presents different visualisation challenges to applying it to a single example map sheet.
It should, yes. MapReader isn't really useful for working with a single map; it's only worthwhile at scale, e.g. more than 200 or so large-scale series maps (though of course this will vary).
- Does the tutorial cover using MapReader for non-map images (e.g. plant phenotype)?
Soon, I think we will separate the map and the non-map applications of this code, so the non-map content will have its own separate tutorial(s). Does that sound right, @kasra-hosseini?
@andrewphilipsmith Thanks. Please see my inline comments:
> Are we expecting users to work through the tutorial using their own data or with sample data that we provide?
I just saw @kmcdono2 's reply. I totally agree with this:
> But we also need to include clearer instructions for how to prepare input for MapReader when people want to bring their own maps to the tool. E.g. which kinds of maps work best, how many, in what format. This is part of the README update I am planning.
> Does the tutorial cover using MapReader at scale? Applying MapReader to hundreds of map sheets (with overlapping margins) presents different visualisation challenges to applying it to a single example map sheet.
> Does the tutorial cover using MapReader for non-map images (e.g. plant phenotype)?
- Are the result data and the map images available in a common CRS? The MapReader paper mentions reprojecting the map sheets. So far, I've been pulling the NLS images from their tile server (in Web Mercator) and plotting the patch results in WGS84. Reprojecting that much data on the fly will always be painful, whatever the visualisation tool.
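For a sense of what "reprojecting on the fly" costs per point, here is a stdlib sketch of the EPSG:4326 (WGS84) to EPSG:3857 (Web Mercator) forward transform. In practice pyproj/GDAL would do this in bulk; this only shows the per-point arithmetic that would otherwise run for every patch on every pan/zoom.

```python
import math

R = 6378137.0  # WGS84 semi-major axis, as used by Web Mercator (EPSG:3857)

def wgs84_to_web_mercator(lon, lat):
    """Forward EPSG:4326 -> EPSG:3857 transform for a single point.

    A stdlib sketch for illustration; use pyproj for real batches.
    """
    x = math.radians(lon) * R
    y = math.log(math.tan(math.pi / 4 + math.radians(lat) / 2)) * R
    return x, y
```

Pre-reprojecting results once, at post-processing time, avoids paying this (and the much heavier raster warp) inside the visualisation loop.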
- Are the patches available as polygon files, or can they be generated procedurally? (E.g. the inset to (d) in this figure: https://user-images.githubusercontent.com/1899856/144105429-f4f02d49-7b2a-4cdb-ae57-19d077aab713.png.) The paper mentions that, by default, MapReader uses a fixed pixel extent on the map sheet - so are they actually square in real-world coordinates?
> so are they actually square in real-world coordinates?
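One way to answer this once a sheet's geotransform is known: a fixed-pixel patch is square on the ground only if the sheet's pixels are square in the projected CRS. The helper below is purely illustrative (its name and arguments are not part of MapReader's API).

```python
def patch_ground_size(patch_px, pixel_size_x, pixel_size_y):
    """Ground extent (width, height, in CRS units) of a square patch
    of patch_px x patch_px pixels.

    pixel_size_x/pixel_size_y would come from the sheet's geotransform
    (pixel_size_y is typically negative for north-up rasters).
    This is an illustrative sketch, not MapReader's API.
    """
    return patch_px * abs(pixel_size_x), patch_px * abs(pixel_size_y)

# Square pixels -> square patch on the ground (e.g. 2.5 m pixels):
w, h = patch_ground_size(100, 2.5, -2.5)
# Anisotropic pixels (or unprojected degrees) -> rectangular patch:
w2, h2 = patch_ground_size(100, 2.5, -3.0)
```

So whether the patches can be generated procedurally as true squares depends on whether the sheets were reprojected to a CRS with square pixels first.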
(Sorry, I have to go to a co-working session now, but I will try to go through your questions in the afternoon)
- I am now thinking maybe it is better to reproject the map sheets after retrieving them from NLS. What do you think?
I think this is a good idea. We want to simplify whatever people need to do to see the data.
However, can we make it easy to accommodate different CRSs, e.g. different map collections?
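One lightweight way to accommodate this might be a small per-collection CRS registry that the post-processing step consults instead of hard-coding Web Mercator. The collection names and EPSG codes below are examples only, not an agreed design.

```python
# Illustrative only: collection names and EPSG codes are examples.
COLLECTION_CRS = {
    "nls_os_one_inch": "EPSG:3857",       # NLS tile server (Web Mercator)
    "some_other_collection": "EPSG:27700", # e.g. British National Grid
}

TARGET_CRS = "EPSG:4326"  # whatever the visualisation layer expects

def source_crs(collection, default="EPSG:4326"):
    """Look up a collection's source CRS, falling back to a default."""
    return COLLECTION_CRS.get(collection, default)
```

The reprojection step would then read `source_crs(collection)` and warp to `TARGET_CRS`, so adding a new collection is a one-line config change.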
> I think there is a case for doing some post-processing of the result to get them in a format suitable for visualisation (which might be different from a format that is suitable for further analysis).
> - Convert the csv files to a spatially-indexed storage format, so that at high zoom levels only the required points need to be read from disk.
> - Interpolate a raster image from the points. A reasoned and repeatable rule-based approach could be used for overlapping patches that have been categorised differently. The resulting image could be tiled, so it is suitable for display at multiple scales. (Switching to vector points/patches at high zoom levels would still be appropriate.)
> - If we (a) exclude the unclassified patches and (b) dissolve the classified patches, then we might have the data in a form well suited for vector tiles, a format designed for displaying large datasets at a range of scales.
> Creating a notebook that does this post-processing would be possible (I've made a start on this). Some of the input parameters of MapReader would need to be accessible (notably the patch size/geometry).
> I've had an initial play with leafmap, which I'm sure is capable of doing what we require (at least within the tutorial's scope). However, I'd want a better understanding of the issue above before making a final recommendation.
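For reference, the "spatially-indexed storage" option above can be sketched with a minimal in-memory grid index. A real implementation would use something like GeoParquet, an R-tree, or vector tiles, but the access pattern is the same: a viewport query touches only the buckets it overlaps, not the whole dataset.

```python
from collections import defaultdict

CELL = 1000.0  # bucket size in CRS units; tune to a typical viewport

def build_index(points, cell=CELL):
    """Bucket (x, y, label) records by grid cell.

    A toy stand-in for a spatially-indexed store: the point of the
    structure is that lookups never scan all records.
    """
    index = defaultdict(list)
    for x, y, label in points:
        index[(int(x // cell), int(y // cell))].append((x, y, label))
    return index

def query_bbox(index, xmin, ymin, xmax, ymax, cell=CELL):
    """Return records inside the bounding box, reading only the
    buckets the box overlaps."""
    hits = []
    for cx in range(int(xmin // cell), int(xmax // cell) + 1):
        for cy in range(int(ymin // cell), int(ymax // cell) + 1):
            for x, y, label in index.get((cx, cy), []):
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append((x, y, label))
    return hits
```

At high zoom levels the map client would issue exactly this kind of bbox query per pan, which is why a flat csv (full scan every time) struggles at scale.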