Open MKIM1008 opened 8 months ago
I've tried some methods to derive coordinates for the SMEs simple point map (1,113,937 cases).
-> e.g. OSM doesn't have enough addresses, leaving lots of empty coordinates, and the Google Maps API would have to be the paid tier to cover the full list, including the large number of cases in Loughborough, the East Midlands and the whole UK.
@polly64 I will simply draw the point map using OS Code Point (postcodes) first next week, to see the brief spatial density and discuss how to map the list in OSMM + UPRNs with @matkoniecz. It may be similar to the recent UPRN discussion Mateusz brought up.
@MKIM1008 Great. Just to say we can't derive any data from OSMM that is then made available in our open downloads. So if we map manufacturing, unfortunately it can only be done using open products.
@polly64 Loughborough SMEs map with Postcode:
Each postcode point on the map hides SMEs that share the same location, so overlapping businesses are not visible. A heatmap or other analysis (e.g. colouring by SIC code) will follow.
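A minimal sketch of how the overlap could be made visible before drawing: count how many SMEs share each postcode, so the point size or label can carry the count. The rows and names below are hypothetical, not taken from the Companies House file.

```python
from collections import Counter

# Hypothetical rows: (company name, postcode as typed in the source file).
smes = [
    ("Alpha Ltd", "LE11 1AA"),
    ("Beta Ltd", "le11 1aa"),
    ("Gamma Ltd", "LE111AA"),
    ("Delta Ltd", "LE11 9ZZ"),
]

def normalise_postcode(raw: str) -> str:
    """Upper-case a UK postcode and put a single space before the
    inward code, which is always the last three characters."""
    compact = "".join(raw.split()).upper()
    return f"{compact[:-3]} {compact[-3:]}"

# Number of SMEs stacked on each postcode point.
counts = Counter(normalise_postcode(pc) for _, pc in smes)
for postcode, n in counts.most_common():
    print(postcode, n)
```

The counts could then be joined to the Code-Point coordinates so that each postcode point shows how many businesses it stands for.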
There are only 4-5 businesses in the three postcodes. It looks like a big shopping mall. We know the mall is surrounded by high streets. The Street View imagery is from 2018, so we cannot use it here.
Hi @MKIM1008 could you possibly add some notes/edit the ones I have made below, for Kate/Cambridge @KatePT @taimaz22, regarding the issues with SME mapping.
Hi @matkoniecz could you give @KatePT editing access so I can assign stuff - many thanks
SME and Manufacturing maps of Britain Notes on planning and current problems we are attempting to resolve:
SME/SIC codes database: We would ideally like to link every building in Britain to the SIC database, which also provides data on SMEs, so that all non-domestic activities can be mapped in a way that means. The Standard Industrial Classification (SIC) codes are very comprehensive and MKIM is now comparing the database with the ISIC international codes. @MKIM1008 could you add the total number of categories here and the government link, and any other info that's relevant? - thanks. Advantages are:
a) a very detailed description is provided, e.g. wholesale fishmongers, retail fish supplier, fish canning, fishing boat storage facilities etc.
b) regularly updated by government
c) available for the whole of Britain
d) looks like it can be compared across countries via similar ISIC codes (we are comparing now)
e) does not differ significantly from the CCRP's current highest tier, so only a minimal amount of collected data will be lost.
Mihyun could you take a screenshot of the SIC Excel file headings and add it here? thanks
Problems with mapping SME SIC codes we have identified:
a) MKIM used OpenStreetMap to obtain open XY coordinates to join with open OS UPRNs, but OSM currently lacks the addresses. @MKIM1008 can you add why addresses are also needed as well as XY coordinates?
b) Map coordinates/UPRNs are not provided in the SME/SIC file, but addresses are.
c) Mapping against OS open Code-Point only allows geolocation at postcode level, which is too generalised.
d) As the image above also shows, there is an issue with all SMEs being displayed at postcode level, as there are many more known to be there than are shown when points are clicked - MKIM is investigating.
e) Google Street View images for Loughborough are very out of date (2018), whereas for Oxford Street, London they are only 8 months old. This means crowdsourcing using Street View images, which we were planning to do as a test for the festival, will not be possible. This will have to be done manually using a phone, or through a call out to residents, if no automated way of populating SMEs/manufacturing can be identified.
f) The SIC file is very large. Ideally we would like to link to an API. MKIM is investigating this, but we think it might only allow download of one item at a time.
Need more sources of data:
a) @MKIM1008 can you list any other sources you are using here, thanks
b) @KatePT if you can add any more sources of SME and manufacturing data you know about here, that would be great. MKIM is also talking to Loughborough's Business dept and to the Chamber of Commerce.
c) The main thing is to first understand what we do and don't have. Once we know what we are missing, we can develop a mapping strategy to fill the gaps and identify key people/government departments we may need to talk with. Do you know if Make UK's database is open? A key aspect is the need for regular updating. We would also like to look at the vacant and derelict mapping with a colleague working in this area.
@matkoniecz can you tell me what we need to have in an SME/land use database to allow us to map to buildings - is it just centroid coordinates (which can be found in the open UPRNs)? If the address and building number are there but not the coordinates, can we still not map because we have no way of linking comprehensive addresses to the polygons?
@polly64 SMEs list from Companies House without UPRNs / UK level / SIC code
"@MKIM1008 could you add the total number of categories here and the government link, and any other info that's relevant?":
In SMEs list: 53 columns, https://download.companieshouse.gov.uk/en_output.html
In SIC code: 21 categories, 731 sub-categories, https://resources.companieshouse.gov.uk/sic/
"@MKIM1008 can you add why addresses are also needed as well as XY coordinates?":
The addresses are being used to obtain XY coordinates for mapping in GIS. These XY coordinates will then be used to link the OS Master Map with UPRNs.
Business rates from the council with UPRNs / Council level / Business description
"@MKIM1008 can you list any other sources you are using here, thanks":
I have a dataset on National Non-Domestic Rates (NNDR) that includes UPRNs. The full list is compiled by each council and could offer annual updates to keep track of business closures, openings and changes over time through their tax assessment. Limitations: 1) the document includes empty properties. 2) the description appears to be derived from the SIC code but is generalised -> however, the specific SIC code can be joined by company name etc. if necessary.
https://www.charnwood.gov.uk/pages/foi_request_business_rates
I will contact the council business team to ask about the business classification.
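A rough sketch of the name-based join mentioned above, attaching a Companies House SIC code to an NNDR row that carries a UPRN. All rows, column names and code values here are hypothetical placeholders, and real names would need more careful matching than this.

```python
import re

# Hypothetical NNDR rows (have UPRNs, no SIC code).
nndr = [
    {"uprn": "100000000001", "ratepayer": "Acme Engineering Limited"},
    {"uprn": "100000000002", "ratepayer": "Soar Valley Bakery Ltd."},
    {"uprn": "100000000003", "ratepayer": "Unmatched Trading Co"},
]
# Hypothetical Companies House rows (have SIC codes, no UPRNs).
companies_house = [
    {"name": "ACME ENGINEERING LTD", "sic": "25620"},
    {"name": "SOAR VALLEY BAKERY LIMITED", "sic": "10710"},
]

def name_key(name: str) -> str:
    """Crude join key: upper-case, drop punctuation and the LTD/LIMITED suffix."""
    key = re.sub(r"[^A-Z0-9 ]", " ", name.upper())
    key = re.sub(r"\b(LIMITED|LTD)\b", "", key)
    return " ".join(key.split())

sic_by_name = {name_key(c["name"]): c["sic"] for c in companies_house}

# Left join: every NNDR row is kept; "sic" is None when no name matches.
joined = [{**row, "sic": sic_by_name.get(name_key(row["ratepayer"]))} for row in nndr]
for row in joined:
    print(row["uprn"], row["ratepayer"], row["sic"])
```

Rows left with `None` are the ones that would need manual checking or a fuzzier match.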
Other references (ongoing) Non-domestic rating: Stock of properties including business floorspace https://www.gov.uk/government/collections/non-domestic-rating-stock-of-properties-collection#2023
Business rates at lower levels of geography in England and Wales https://www.ons.gov.uk/businessindustryandtrade/business/activitysizeandlocation/articles/businessratesatlowerlevelsofgeographyinenglandandwalesresearchupdate/november2022
@MKIM1008 thanks so much, could you possibly jot down some things below just so I am super clear:
i) could you list here the problems with accessing the right open address data to enable us to map manufacturing/SMEs?
ii) could you give all open address sources at Loughborough and Britain scale we could use, with a note on any issues with each?
iii) With the OS OML/INSPIRE merge where @matkoniecz is adding open UPRNs, how do we link the SIC codes from the Companies House dataset?
iv) Why are we having to link the Companies House data at postcode level?
v) Is there an API for the Companies House dataset?
vi) Did you find a Further Education (FE) college dataset we could map?
@polly64
Here is the list of problems and possible solutions to enable us to map manufacturing/SMEs at building level using both OSMM and the new open Inspire footprint merge.
1) Address data (i) We now have access to address data that enables mapping of SMEs/manufacturing, from the Companies House dataset and the NNDR dataset. These are both released under the Open Government Licence v3.0.
(ii) An open Python package can be used to derive longitude-latitude (EPSG:4326, global) or XY easting-northing (EPSG:27700, British National Grid) coordinates from OpenStreetMap addresses: Nominatim is the name of the OSM geocoder it uses.
(iii) We haven't yet found any other comprehensive open address sources for Britain or the East Midlands.
2) Manufacturing / SME data (i) Companies House data has SIC codes + addresses but no UPRNs.
(ii) National Non-Domestic Rates (NNDR)/business rate data
(iii) Companies House data has a longer list of SMEs than the NNDR data
e.g. for Loughborough, around 4000 businesses in Companies House data / 800 businesses in NNDR data
[ ] @MKIM1008 to explore why there is such a difference between the Companies House and NNDR datasets
[ ] @MKIM1008 to identify what Companies House data is derived from (e.g. NNDR data from the tax assessment each year)
[x] @MKIM1008 to map and join the NNDR and Companies House datasets in QGIS.
[x] @MKIM1008 to join the datasets by UPRNs (found only in NNDR) which will enable us to visualise specific SIC codes at building level (because the SIC codes are described in the Companies House data not in NNDR)
[ ] @MKIM1008 to identify whether NNDR land-use activities are described differently within the Companies House dataset at the local council level
(iv) The way the land use activities data from the council (MyProperty) is structured means that UPRNs
3) Updating SME data using an API
4) Adding the location of Further education (FE) colleges
(We can use the Church of England to obtain the data. They said the data will not be provided in Excel. https://www.arcgis.com/home/webmap/viewer.html?webmap=67bce0ed36dd4ee0af7a16bc079aa09a&extent=-0.4317,51.313,0.1986,51.5945)
5) Layers
@polly64
We have been granted access to the highstreet footfall and business sales data systems. While these datasets are not open data, public organisations (Love Loughborough and Charnwood Borough Council) have authorised us to visualise this data on our platform. Could we use this data for Colouring Loughborough as layers?
@MKIM1008 let's experiment
SIC (UK) might have more categories within sectors that are crucial to the UK economy, such as financial services, oil and gas production, or specific types of manufacturing. ISIC, aiming for international applicability, might have broader categories that can encompass a wide range of activities but with less specificity to any particular country's economic structure
@matkoniecz add Cambridge INSPIRE open merge polygons for Cambridge Festival so we can show:
a) international (polly to make film)
b) national and regional - @mdsimpson42 to add admin and infrastructure layers
c) local - @MKIM1008 to add manufacturing data to OSMM for Loughborough on live site
local2 - @matkoniecz to add INSPIRE open merge for Cambridge or Cambridge city centre for residents to test on staging
If we do we can use other Python packages to extract the coordinates from any other open platforms like OpenStreetMap. @matkoniecz do you have any comments on this?
1) coverage may be lower than needed
2) https://www.openstreetmap.org/copyright and https://osmfoundation.org/wiki/Licence/Community_Guidelines/Geocoding_-_Guideline would apply
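As a hedged illustration of what an OSM-based lookup involves, the sketch below only builds a Nominatim search URL for one address and does not send it. The address is made up; `q`, `format`, `countrycodes` and `limit` are parameters of the public Nominatim search endpoint.

```python
from urllib.parse import urlencode, urlparse, parse_qs

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"

def build_geocode_url(address: str) -> str:
    """Return a Nominatim search URL for a free-text UK address."""
    params = {
        "q": address,          # free-text address from the SME list
        "format": "jsonv2",    # JSON response with lat/lon per result
        "countrycodes": "gb",  # restrict matches to the UK
        "limit": 1,            # keep only the best match
    }
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"

# Hypothetical address, for illustration only.
url = build_geocode_url("1 Example Street, Loughborough")
print(url)
```

Results come back as WGS84 latitude/longitude (EPSG:4326), so they would still need reprojecting to EPSG:27700 before joining with OS data. For a list of this size the public server's usage policy and the geocoding guideline linked above would both need checking first, and addresses missing from OSM simply return no result.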
@matkoniecz and @mdsimpson could you add a link to the Companies House database
where?
Hi @MKIM1008 can you give a bit more info on this - do we still need?
- [ ] @MKIM1008 to map and join the NNDR and Companies House datasets in QGIS.
@polly64.
@taimaz22 mentioned that around 600 SMEs are expected in Loughborough. The NNDR dataset includes 800 cases in Loughborough, so it looks like a better fit than the Companies House data (approx. 4000 cases).
The NNDR list will be compared with Love Loughborough's SME dataset of 600 cases. - they are the local Business Improvement District manager (?).
[NNDR in OSMM - 57% UPRN coverage in Charnwood, 2842 out of 4975] The map will be updated here. (584 cases)
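The coverage figure can be recomputed as the join improves. This small sketch assumes the two numbers above mean 2842 NNDR records matched out of 4975 in total.

```python
matched = 2842  # NNDR records matched by UPRN (figure quoted above)
total = 4975    # all NNDR records for Charnwood (figure quoted above)

coverage = matched / total
missing = total - matched
print(f"{coverage:.1%} matched, {missing} records without a usable UPRN")
# prints "57.1% matched, 2133 records without a usable UPRN"
```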
We will ask the council about the land use classification.
super helpful @MKIM1008 thanks for spotting duplicates we should inform council
@polly64
Manufacturing / SME map for Loughborough (list shown in NNDR data)
The list of FE colleges in the UK https://www.gov.uk/government/publications/further-education-colleges-in-the-uk
FE colleges map https://www.aoc.co.uk/about/college-directory
DEC doesn't fully support FE colleges data.
@polly64
I've gathered NNDR lists from the websites of the 40 councils in the East Midlands, because gov.uk doesn't provide all of them or very recent datasets. However, only a few councils provide complete lists of non-domestic properties, and these often lack UPRNs (only North Kesteven District Council has UPRNs in their NNDR data).
I can ask them to release the NNDR list with UPRNs (supported by the Freedom of Information Act, similar to the Charnwood case), but I expect it would take some time.
I am wondering if this would work for Cambridge Festival: highlighting the SME context in Charnwood (or Loughborough) compared with the full list of businesses in Britain. (For Charnwood) NNDR data mapping in OSMM using UPRNs + (For Britain) Companies House data mapping in Census polygons with postcodes.
@matkoniecz, could you recommend another way to map the data (with addresses) in OS Master Map polygons during the meeting today?
@matkoniecz can you tell me what we need to have in an SME/land use database to allow us to map to buildings - is it just centroid coordinates (which can be found in the open UPRNs)? If the address and building number are there but not the coordinates, can we still not map because we have no way of linking comprehensive addresses to the polygons?
Centroids are enough if accurate enough, except some edge cases like C-shaped or O-shaped buildings where centroid is outside building geometry.
Address can be used to reconstruct centroids, but that requires geocoding with dataset complete enough.
UPRN/TOID in UK would be the best
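To illustrate the edge case above, here is a small self-contained sketch, using a made-up C-shaped footprint in arbitrary units, that computes the area-weighted centroid and checks whether it falls inside the polygon.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = cx = cy = 0.0  # area2 accumulates twice the signed area
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3 * area2), cy / (3 * area2)

def point_in_polygon(pt, ring):
    """Ray casting: toggle on each edge crossed by a ray going right from pt."""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical C-shaped building: a 4x4 square with a notch open to the east.
c_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (4, 3), (4, 4), (0, 4)]
rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]

c_centroid = polygon_centroid(c_shape)
print(c_centroid, point_in_polygon(c_centroid, c_shape))
print(polygon_centroid(rectangle), point_in_polygon(polygon_centroid(rectangle), rectangle))
```

The C-shape's centroid lands in the courtyard, so a centroid-only join would miss the building. A point guaranteed to lie on the footprint is the usual workaround for such shapes.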
@MKIM1008 could you talk to @matkoniecz about this so you can share tasks. Many thanks
@matkoniecz, can we talk through the approach above for mapping the SME data (only with address and sometimes building reference number) to OS Master Map polygons during the meeting today?
An alternative way to derive UPRNs https://www.findmyaddress.co.uk/search
Recommended by https://www.local.gov.uk/our-support/research-and-data/data-and-transparency/using-unique-property-reference-number-uprn
@polly64
UPRN missing cases:
NNDR map in Charnwood
Loughborough
@polly64
Suggestion: visualisation for screen showcase
SME locations linked to transport context
In census area context (e.g. Built-up areas and SME distribution) - updating
@polly64,
NNDR map in Charnwood + important buildings from OS Open Map Local
Loughborough
North Loughborough and town centre
@matkoniecz, Could you please let me know how to link UPRNs to OS Open Map Local building polygons? Once I linked them some missing data was found - around 300 cases without UPRNs from OS Open Map Local.
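One possible way to do this linking, sketched under the assumption that each UPRN is a point in the same coordinate system as the building polygons. The IDs, coordinates and footprints below are all hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test for a point against a simple polygon ring."""
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y):
            if x < x0 + (y - y0) * (x1 - x0) / (y1 - y0):
                inside = not inside
    return inside

# Hypothetical building footprints (easting, northing) and UPRN points.
buildings = {
    "bldg_a": [(0, 0), (10, 0), (10, 10), (0, 10)],
    "bldg_b": [(20, 0), (30, 0), (30, 10), (20, 10)],
}
uprns = {
    "100000000001": (5, 5),
    "100000000002": (25, 5),
    "100000000003": (15, 5),  # falls between the two buildings
}

def link_uprns(uprns, buildings):
    """Map each UPRN to the first building containing its point, else None."""
    links = {}
    for uprn, (x, y) in uprns.items():
        links[uprn] = next(
            (bid for bid, ring in buildings.items() if point_in_polygon(x, y, ring)),
            None,
        )
    return links

links = link_uprns(uprns, buildings)
unmatched = [uprn for uprn, bid in links.items() if bid is None]
print(links)
print("unmatched:", unmatched)
```

At Charnwood scale the same point-in-polygon join would normally be run as a spatial join in QGIS rather than a nested loop. The `unmatched` list is where cases like the roughly 300 records mentioned above would show up for checking.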
Cambridge Science Festival demo
Has it been completed or is it still continuing?
@MKIM1008 Comparison among:
@polly64,
I was wondering whether we should categorise student halls on campus as Education - hotels/boardings/guest houses (non-residential). They are normally categorised by term-time use, so perhaps they can't count as residential buildings?
UK planning portal (MHCLG, 2020a) London Building Stock Model National LaDOS’s Addressbase National Land Use Database (NLUD) UK National Asset Register (NAR)
@polly64 Should we consider the Constituency boundary? - electoral boundary. Since Loughborough represents the boundary titled as Loughborough C.C
1) Loughborough County Constituency boundary (assembly of towns/wards for votes)
2) Loughborough Town Deals boundary (from Built Up Area) - we are under this programme.
I found that groups of London boroughs make up the London constituency boundaries. - Colouring London doesn't consider this.
https://en.wikipedia.org/wiki/List_of_London_Assembly_constituencies
@MKIM1008 yes interesting thanks
@polly64
Business activity code in NNDR: Scat code & Primary description code https://assets.publishing.service.gov.uk/media/5a804472e5274a2e87db8d3c/Data_Info_and_Methodology.pdf
NNDR seems to use mixed information from columns A - J.
@matkoniecz @polly64
1) Most NNDR datasets from individual councils use a property reference instead of a UPRN. One council said this is the UPRN, but I don't think it is. Could we explore whether the property reference number in NNDR could be linked to building polygons?
2) VOA is using UARN (Unique Address Reference Number) in their Non-Domestic Rating datasets.
UARN: VOA internal key used to link information about the same hereditament - Mihyun has asked VOA for the list via FOI.
UARN: 11966889000
1299*2710*N006*N00609003481*CW1*LAND USED FOR STORAGE AND PREMISES*11966889000*PARK FARM ALNE STATION | ALNE | YORK | NORTH YORKSHIRE**PARK FARM ALNE STATION*ALNE*YORK**NORTH YORKSHIRE*YO61 1TT**C*6700**26585372000**148G****33560917282*01-APR-2023**
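The sample record above is asterisk-delimited, so a first look at it can be sketched as below. Only the UARN position is confirmed by the note above; the other two field labels are guesses from the values and should be checked against the VOA documentation.

```python
# The sample VOA record quoted above, split across lines for readability.
record = (
    "1299*2710*N006*N00609003481*CW1*LAND USED FOR STORAGE AND PREMISES*"
    "11966889000*PARK FARM ALNE STATION | ALNE | YORK | NORTH YORKSHIRE**"
    "PARK FARM ALNE STATION*ALNE*YORK**NORTH YORKSHIRE*YO61 1TT**C*6700**"
    "26585372000**148G****33560917282*01-APR-2023**"
)

fields = record.split("*")  # empty strings mark blank fields

uarn = fields[6]          # matches the UARN quoted above
description = fields[5]   # assumed: primary description text
postcode = fields[14]     # assumed: postcode, judging by its format

print(uarn, "|", description, "|", postcode)
```

Reading a full file this way would give a UARN-keyed table that could later be compared with council property references, if a link between the two can be established.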
@polly64 @matkoniecz
SMEs data has address issues as they do not match the OS UPRNs.
-> I used OSM to obtain XY coordinates to join with OS UPRNS but OSM currently lacks the addresses.
I was wondering if we could resolve the issue using OS Code-Point or other sources we could use.