colouring-cities / colouring-britain

Developed out of the Colouring London prototype. Collecting data on Britain's buildings and testing new core features
https://colouringbritain.org/
GNU General Public License v3.0

Data: SME content/Landuse #285

Open MKIM1008 opened 8 months ago

MKIM1008 commented 8 months ago

@polly64 @matkoniecz

The SME data has address issues: the addresses do not match the OS UPRNs.

-> I used OSM to obtain XY coordinates to join with OS UPRNs, but OSM currently lacks the addresses.

I was wondering if we could resolve the issue using OS Code-Point or other sources we could use.

MKIM1008 commented 8 months ago

I've tried some methods to derive coordinates for the SMEs simple point map (1,113,937 cases).

-> e.g. OSM doesn't have enough addresses, leaving lots of empty coordinates, and the Google Maps API would need the paid tier to cover the full list, including the many cases in Loughborough, E.Midlands and the whole UK.


@polly64 I will simply draw the point map using OS Code-Point (postcodes) first next week, to get a rough view of the spatial density, and then discuss with @matkoniecz how to map the list in OSMM + UPRNs. It may be similar to the recent UPRN discussion Mateusz brought up.

polly64 commented 8 months ago

@MKIM1008 Great. Just to say, we can't derive any data from OSMM that is then made available in our open downloads. So if we map manufacturing, unfortunately it can only be done using open products.

MKIM1008 commented 8 months ago

@polly64 Loughborough SMEs map with Postcode: image

Each postcode point on the map hides the SMEs that overlap at the same location. A heatmap or other analysis (colouring by SIC code) will follow.
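One way to make the overlap visible before building a heatmap is to count how many records share each postcode and use that count as the point weight. The following is a minimal sketch, assuming the postcodes have already been read out of the SME list; the sample values are illustrative and not taken from the Companies House file.

```python
from collections import Counter


def normalise_postcode(raw: str) -> str:
    """Upper-case and re-space a UK postcode so 'le11 3tu' and 'LE113TU' match."""
    compact = "".join(raw.split()).upper()
    # The inward code is always the last three characters.
    return f"{compact[:-3]} {compact[-3:]}" if len(compact) > 3 else compact


def count_by_postcode(postcodes):
    """Return how many records share each normalised postcode."""
    return Counter(normalise_postcode(p) for p in postcodes if p and p.strip())


# Illustrative sample values only.
sample = ["LE11 3TU", "le113tu", "LE11 1AA", " LE11 3TU ", ""]
counts = count_by_postcode(sample)
```

The resulting counts can be joined to the postcode points so that symbol size or colour reflects how many SMEs sit behind each point.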

MKIM1008 commented 8 months ago

There are only 4-5 businesses in the three postcodes. It looks like a big shopping mall. We know the mall is surrounded by high streets. The Street View imagery is from 2018, so we cannot use it here. image

polly64 commented 8 months ago

Hi @MKIM1008 could you possibly add some notes/edit the ones I have made below, for Kate/Cambridge @KatePT @taimaz22, regarding the issues with mapping SMEs.

Hi @matkoniecz could you give @KatePT access to editing so I can assign stuff - many thanks

SME and Manufacturing maps of Britain Notes on planning and current problems we are attempting to resolve:

  1. SME/SIC codes database: We would ideally like to link every building in Britain to the SIC database, which also provides data on SMEs, so that all non-domestic activities can be mapped in a way that means. The Standard Industrial Classification (SIC) codes are very comprehensive and MKIM is now comparing the database with the international ISIC codes. @MKIM1008 could you add the total number of categories here and the government link, and any other info that's relevant? - thanks. Advantages are: a) a very detailed description is provided, e.g. wholesale fishmongers, retail fish supplier, fish canning, fishing boat storage facilities etc. b) regularly updated by government c) available for the whole of Britain d) looks comparable across countries via the similar ISIC codes (we are comparing now) e) does not differ significantly from the CCRP's current highest tier, so only a minimal amount of collected data will be lost. Mihyun could you take a screenshot of the SIC Excel file headings and add here? thanks

  2. Problems with mapping SME SIC codes we have identified: a) MKIM used OpenStreetMap to obtain open XY coordinates to join with open OS UPRNs, but OSM currently lacks the addresses. @MKIM1008 can you add why addresses are also needed as well as XY coordinates? b) Map coordinates/UPRNs are not provided in the SME/SIC file but addresses are. c) Mapping against OS open Code-Point only allows geolocation at postcode level, which is too generalised. d) As the image above also shows, there is an issue with all SMEs being displayed at postcode level, as there are many more known to be there than shown when points are clicked - MKIM is investigating. e) Google Street View images for Loughborough are very out of date (2018), whereas for Oxford Street, London they are only 8 months old. This means crowdsourcing using Street View images, which we were planning to do as a test for the festival, will not be possible. This will have to be done manually using a phone, or through a call out to residents, if no automated way of populating SMEs/manufacturing can be identified. f) The SIC file is very large. Ideally we would like to link to an API. MKIM is investigating this but we think it might only allow download of one item at a time.

  3. Need more sources of data: a) @MKIM1008 can you list any other sources you are using here, thanks b) @KatePT if you can add any more sources of SME and manufacturing data you know about here that would be great. MKIM is also talking to Loughborough's Business dept and to the Chamber of Commerce. c) The main thing is to first understand what we do and don't have. Once we know what we are missing we can work on a mapping strategy to fill the gaps and identify key people/government departments we may need to talk with. Do you know if Make UK's database is open? A key aspect is the need for regular updating. We would also like to look at the vacant and derelict mapping with a colleague working in this area.

polly64 commented 8 months ago

@matkoniecz can you tell me what we need to have in an SME/Land use database to allow us to map to buildings - is it just centroid coordinates (which can be found in the open UPRNs)? If the address and building number are there but not coordinates, can we still not map because we have no way of linking these addresses comprehensively to the polygons?

MKIM1008 commented 8 months ago

@polly64 SMEs list from Companies House: without UPRNs / UK level / SIC code

@MKIM1008 could you add the total number of categories here and the government link, and any other info that's relevant? :

In the SMEs list: 53 columns, https://download.companieshouse.gov.uk/en_output.html

In the SIC codes: 21 categories, 731 sub-categories, https://resources.companieshouse.gov.uk/sic/

@MKIM1008 can you add why addresses are also needed as well as XY coordinates?. : The addresses are being used to obtain XY coordinates for mapping in GIS. These XY coordinates will then be used to link the OS Master Map with UPRNs.

Business rates from the council with UPRNs / Council level / Business description @MKIM1008 can you list any other sources you are using here, thanks : I have a dataset on National Non-Domestic Rates (NNDR) that includes UPRNs. The full list is compiled by each council and could be refreshed annually to keep track of business closures, openings, and changes over time through their tax assessment. Limitations: 1) the document includes empty properties. 2) the description appears to be derived from the SIC code but is generalised. However, the specific SIC code can be joined by company name, etc. if necessary. image

https://www.charnwood.gov.uk/pages/foi_request_business_rates

I will contact the council business team to ask about the business classification.
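Joining the specific SIC code onto NNDR rows by company name could look roughly like the sketch below. It is only an illustration under stated assumptions: the keys `name`, `uprn` and `sic_code` are hypothetical stand-ins for whatever the real column headings are, the suffix list is a guess at common variants, and the sample rows are invented.

```python
import re

# Common legal suffixes to strip before comparing names (an illustrative list).
_SUFFIXES = re.compile(r"\b(LIMITED|LTD|PLC|LLP)\b\.?")


def name_key(name: str) -> str:
    """Reduce a company name to a comparison key."""
    upper = name.upper().replace("&", " AND ")
    upper = _SUFFIXES.sub(" ", upper)
    # Keep letters and digits only, collapsing everything else to single spaces.
    return " ".join(re.sub(r"[^A-Z0-9]+", " ", upper).split())


def attach_sic(nndr_rows, companies_rows):
    """Copy 'sic_code' onto each NNDR row whose name key matches exactly one company."""
    by_key = {}
    for row in companies_rows:
        by_key.setdefault(name_key(row["name"]), []).append(row["sic_code"])
    joined = []
    for row in nndr_rows:
        codes = by_key.get(name_key(row["name"]), [])
        # Ambiguous or missing matches are left as None for manual review.
        joined.append({**row, "sic_code": codes[0] if len(codes) == 1 else None})
    return joined


# Invented sample rows, for illustration only.
nndr = [{"name": "Example Foundry Ltd.", "uprn": "1"},
        {"name": "Unknown Trader", "uprn": "2"}]
companies = [{"name": "EXAMPLE FOUNDRY LIMITED", "sic_code": "24510"}]
joined = attach_sic(nndr, companies)
```

Exact-key matching is deliberately conservative: it will miss trading names that differ from registered names, so unmatched rows still need checking by hand.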

Other references (ongoing) Non-domestic rating: Stock of properties including business floorspace https://www.gov.uk/government/collections/non-domestic-rating-stock-of-properties-collection#2023

Business rates at lower levels of geography in England and Wales https://www.ons.gov.uk/businessindustryandtrade/business/activitysizeandlocation/articles/businessratesatlowerlevelsofgeographyinenglandandwalesresearchupdate/november2022

polly64 commented 8 months ago

@MKIM1008 thanks so much, could you possibly jot down some things below just so I am super clear: i) could you list the problems here with accessing the right open address data to enable us to map manufacturing/SMEs? ii) could you give all open address sources at Loughborough and Britain scale we could use here, with a note on any issues with each? iii) With the OS OML/INSPIRE merge where @matkoniecz is adding open UPRNs, how do we link the SIC codes from the Companies House dataset? iv) Why are we having to link the Companies House data at postcode level? v) Is there an API for the Companies House dataset? vi) did you find a Further Education (FE) college dataset we could map?

MKIM1008 commented 8 months ago

@polly64

Here is the list of problems and possible solutions to enable us to map manufacturing/SMEs at building level using both OSMM and the new open Inspire footprint merge.

1) Address data (i) We now have access to address data that enables mapping of SMEs/manufacturing, from the Companies House dataset and the NNDR dataset. Both are released under Open Government Licence v3.0.

(ii) An open Python package can be used to derive coordinates from OpenStreetMap addresses via Nominatim, OpenStreetMap's geocoder. The results are Longitude-Latitude (EPSG: 4326, global), which can be reprojected to XY (EPSG: 27700, UK).
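As a rough illustration of the geocoding step, the sketch below queries the public Nominatim search endpoint using only the Python standard library. It is a sketch under assumptions: the `user_agent` value is something the caller must supply to identify the project, the one-second pause reflects the public server's usage policy, and an address that OSM does not hold simply returns `None`.

```python
import json
import time
import urllib.parse
import urllib.request

NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def build_search_url(address: str) -> str:
    """Build a Nominatim search URL restricted to Great Britain."""
    params = {"q": address, "format": "jsonv2", "countrycodes": "gb", "limit": 1}
    return f"{NOMINATIM_SEARCH}?{urllib.parse.urlencode(params)}"


def geocode(address: str, user_agent: str, pause_s: float = 1.0):
    """Return (lon, lat) in EPSG:4326, or None if OSM has no match."""
    request = urllib.request.Request(
        build_search_url(address), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        results = json.load(response)
    time.sleep(pause_s)  # stay within one request per second on the public server
    if not results:
        return None
    return float(results[0]["lon"]), float(results[0]["lat"])


# Building the URL needs no network access.
url = build_search_url("Market Place, Loughborough")
```

At one request per second, a list of about 1.1 million addresses would take roughly two weeks on the public server, so a bulk run would more realistically need a self-hosted instance or a different open source of coordinates.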

(iii) We haven't yet found any other comprehensive open address sources for Britain or E.Midlands

2) Manufacturing / SME data (i) Companies house data has SIC code + addresses but without UPRNs.

(ii) National Non-Domestic Rates (NNDR)/business rate data

(iii) Companies house data has a longer list of SMEs than the NNDR data

(iv) The way the land use activities data from the council (MyProperty) is structured means that UPRNs

3) Updating SME data using an API

4) Adding the location of Further education (FE) colleges

(We can use the Church of England to obtain the data. They said the data will not be provided in Excel. https://www.arcgis.com/home/webmap/viewer.html?webmap=67bce0ed36dd4ee0af7a16bc079aa09a&extent=-0.4317,51.313,0.1986,51.5945)

5) Layers

MKIM1008 commented 8 months ago

@polly64

We have been granted access to the highstreet footfall and business sales data systems. While these datasets are not open data, public organisations (Love Loughborough and Charnwood Borough Council) have authorised us to visualise this data on our platform. Could we use this data for Colouring Loughborough as layers?

@MKIM1008 let's experiment

MKIM1008 commented 8 months ago

SIC (UK) might have more categories within sectors that are crucial to the UK economy, such as financial services, oil and gas production, or specific types of manufacturing. ISIC, aiming for international applicability, might have broader categories that can encompass a wide range of activities but with less specificity to any particular country's economic structure

polly64 commented 8 months ago

@matkoniecz add Cambridge INSPIRE open merge polygons for the Cambridge festival so we can show: a) international (polly to make film) b) national and regional - @mdsimpson42 to add admin and infrastructure layers c) local - @MKIM1008 to add manufacturing data to OSMM for Loughborough on live site; local 2 - @matkoniecz to add INSPIRE open merge for Cambridge or Cambridge city centre for residents to test on staging

matkoniecz commented 8 months ago

If we do we can use other Python packages to extract the coordinates from any other open platforms like OpenStreetMap. @matkoniecz do you have any comments on this?

1) coverage may be lower than needed

2) https://www.openstreetmap.org/copyright and https://osmfoundation.org/wiki/Licence/Community_Guidelines/Geocoding_-_Guideline would apply

matkoniecz commented 8 months ago

@matkoniecz and @mdsimpson could you add a link to the Companies House database

where?

Hi @MKIM1008 can you give a bit more info on this - do we still need?

MKIM1008 commented 8 months ago
  • [ ] @MKIM1008 to map and join the NNDR and Companies House datasets in QGIS.

@polly64.

@taimaz22 mentioned that around 600 SMEs are expected in Loughborough. The NNDR dataset includes 800 cases in Loughborough, so it looks like a better fit than the Companies House data (approx. 4000 cases).

The NNDR list will be compared with Love Loughborough's SME dataset of 600 cases. - they are the local Business Improvement District manager (?).

[NNDR in OSMM - 57% UPRN coverage in Charnwood, 2842 out of 4975] The map will be updated here. (584 cases)

I will ask the council about the Land Use classification. image
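For reference, the coverage figure can be recomputed whenever a council publishes a new NNDR list. This is a minimal sketch, assuming each row is a dictionary and that the UPRN sits under a hypothetical `uprn` key; the rows below are synthetic and only reproduce the 2842 out of 4975 split quoted above.

```python
def uprn_coverage(rows, uprn_field="uprn"):
    """Return (matched, total, percent) for rows carrying a non-empty UPRN."""
    total = len(rows)
    matched = sum(1 for row in rows if str(row.get(uprn_field) or "").strip())
    percent = 100.0 * matched / total if total else 0.0
    return matched, total, percent


# Synthetic rows reproducing the Charnwood split.
rows = [{"uprn": "x"}] * 2842 + [{"uprn": ""}] * 2133
matched, total, percent = uprn_coverage(rows)
```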

polly64 commented 8 months ago

super helpful @MKIM1008 thanks for spotting duplicates we should inform council

MKIM1008 commented 7 months ago

@polly64

Manufacturing / SME map for Loughborough (list shown in NNDR data) image

The list of FE colleges in the UK https://www.gov.uk/government/publications/further-education-colleges-in-the-uk

FE colleges map https://www.aoc.co.uk/about/college-directory image

MKIM1008 commented 7 months ago

DEC doesn't fully support FE colleges data. image

MKIM1008 commented 7 months ago

@polly64

I've gathered NNDR lists from the websites of the 40 councils in the East Midlands, because gov.uk doesn't provide all of them or very recent datasets. However, only a few councils provide complete lists of non-domestic properties, and these often lack UPRNs (only North Kesteven District Council has UPRNs in their NNDR data).

I can ask them to release the NNDR list with UPRNs (supported by the Freedom of Information Act, similar to the Charnwood case) but I expect it would take some time.

I am wondering whether this would work for the Cambridge Festival: highlighting the SME context in Charnwood (or Loughborough) compared with the full list of businesses in Britain. For Charnwood: NNDR data mapped in OSMM using UPRNs. For Britain: Companies House data mapped in Census polygons using postcodes.

@matkoniecz, could you recommend another way to map the data (with addresses) in OS Master Map polygons during the meeting today?

matkoniecz commented 7 months ago

@matkoniecz can you tell me what we need to have in an SME/Land use database to allow us to map to buildings - is it just centroid coordinates (which can be found in the open UPRNs)? If the address and building number are there but not coordinates, can we still not map because we have no way of linking these addresses comprehensively to the polygons?

Centroids are enough if accurate enough, except some edge cases like C-shaped or O-shaped buildings where centroid is outside building geometry.

Address can be used to reconstruct centroids, but that requires geocoding with dataset complete enough.

UPRN/TOID in UK would be the best
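The C-shaped edge case can be checked with a small worked example. The sketch below uses only the standard library: an area-weighted centroid from the shoelace formula and a ray-casting containment test. The footprint coordinates are invented for illustration, and points exactly on a boundary are not handled.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    area2 = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        area2 += cross  # accumulates twice the signed area
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (3.0 * area2), cy / (3.0 * area2)


def contains(ring, point):
    """Ray-casting point-in-polygon test (boundary cases not handled)."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y) and x < x0 + (y - y0) * (x1 - x0) / (y1 - y0):
            inside = not inside
    return inside


# Invented C-shaped footprint: a 10 x 10 square with a notch cut into its right side.
c_shape = [(0, 0), (10, 0), (10, 2), (2, 2), (2, 8), (10, 8), (10, 10), (0, 10)]
centroid = polygon_centroid(c_shape)
centroid_is_inside = contains(c_shape, centroid)
```

Here the centroid falls in the notch rather than in the building, so a join that relies on centroids would fail for this footprint, while a point taken from inside the wall, such as (1, 5), is matched correctly.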

MKIM1008 commented 7 months ago

@matkoniecz

https://www.findmyaddress.co.uk/search

Recommended by https://www.local.gov.uk/our-support/research-and-data/data-and-transparency/using-unique-property-reference-number-uprn

polly64 commented 7 months ago

@MKIM1008 could you talk to @matkoniecz about this so you can share tasks. Many thanks

MKIM1008 commented 7 months ago

@matkoniecz, can we talk about the approach above for mapping the SME data (which has only addresses and sometimes a building reference number) into OS Master Map polygons during the meeting today?

An alternative way to derive UPRNs https://www.findmyaddress.co.uk/search

Recommended by https://www.local.gov.uk/our-support/research-and-data/data-and-transparency/using-unique-property-reference-number-uprn

MKIM1008 commented 7 months ago

@polly64

UPRN missing cases:

NNDR map in Charnwood image

Loughborough image

image

MKIM1008 commented 7 months ago

@polly64

Suggestion: visualisation for screen showcase

SME locations linked to transport context image image

In census area context (e.g. Built-up areas and SME distribution) - updating

MKIM1008 commented 7 months ago

@polly64,

NNDR map in Charnwood + important buildings from OS Open Map Local image image

Loughborough image

North Loughborough and town centre image

@matkoniecz, Could you please let me know how to link UPRNs to OS Open Map Local building polygons? Once I linked them some missing data was found - around 300 cases without UPRNs from OS Open Map Local.
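One generic way to do this kind of linking is a point-in-polygon join: each UPRN point is assigned to the footprint that contains it, and footprints left with no UPRN are listed for checking. The sketch below is standard-library only and assumes both layers are already in the same projected coordinate system; the building ids, UPRNs and coordinates are invented. It is not necessarily how the project's own pipeline works.

```python
def contains(ring, point):
    """Ray-casting point-in-polygon test (boundary cases not handled)."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        if (y0 > y) != (y1 > y) and x < x0 + (y - y0) * (x1 - x0) / (y1 - y0):
            inside = not inside
    return inside


def bbox(ring):
    """Bounding box (minx, miny, maxx, maxy) used as a cheap pre-filter."""
    xs, ys = zip(*ring)
    return min(xs), min(ys), max(xs), max(ys)


def link_uprns(buildings, uprn_points):
    """Map building id -> list of UPRNs whose point falls inside the footprint."""
    boxes = {bid: bbox(ring) for bid, ring in buildings.items()}
    linked = {bid: [] for bid in buildings}
    for uprn, (x, y) in uprn_points.items():
        for bid, ring in buildings.items():
            minx, miny, maxx, maxy = boxes[bid]
            if minx <= x <= maxx and miny <= y <= maxy and contains(ring, (x, y)):
                linked[bid].append(uprn)
                break  # a point belongs to at most one footprint
    return linked


def square(x, y, size=10):
    """Helper producing an invented square footprint."""
    return [(x, y), (x + size, y), (x + size, y + size), (x, y + size)]


buildings = {"a": square(0, 0), "b": square(20, 20), "c": square(40, 40)}
uprn_points = {"100": (5, 5), "101": (6, 4), "102": (25, 25), "103": (15, 15)}
linked = link_uprns(buildings, uprn_points)
without_uprn = [bid for bid, uprns in linked.items() if not uprns]
```

The nested loop is fine for a town but slow for a whole region; the same join is available as "Join attributes by location" in QGIS or as a spatial join in GeoPandas, both of which use a spatial index.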

matkoniecz commented 6 months ago

Cambridge Science Festival demo

Has it completed or is it still continuing?

MKIM1008 commented 6 months ago

https://github.com/colouring-cities/colouring-britain/issues/320#issuecomment-2031504504

MKIM1008 commented 6 months ago

@MKIM1008 Comparison among:

MKIM1008 commented 6 months ago

@polly64,

I was wondering whether we should categorise student halls on campus as Education - hotels/boardings/guest houses (non-residential). They are normally categorised for term-time use, so they can't be residential buildings?

MKIM1008 commented 5 months ago

UK planning portal (MHCLG, 2020a) London Building Stock Model National LaDOS’s Addressbase National Land Use Database (NLUD) UK National Asset Register (NAR)

MKIM1008 commented 5 months ago

https://github.com/colouring-cities/colouring-britain/issues/61

Landuse

MKIM1008 commented 5 months ago

@polly64 Should we consider the Constituency boundary? - electoral boundary. Since Loughborough represents the boundary titled as Loughborough C.C

1) Loughborough County Constituency boundary (assembly of towns/wards for votes)

Screenshot_20240505_002419_Chrome.jpg

Screenshot_20240505_003018_Samsung Notes.jpg

2) Loughborough Town Deals boundary (from Built Up Area) - we are under this programme. Screenshot_20240505_003201_Chrome.jpg

I found that groups of London boroughs make up the London Assembly constituency boundaries. - Colouring London doesn't consider this.

https://en.wikipedia.org/wiki/List_of_London_Assembly_constituencies

polly64 commented 5 months ago

@MKIM1008 yes interesting thanks

MKIM1008 commented 4 months ago

@polly64

Business activity code in NNDR: Scat code & Primary description code https://assets.publishing.service.gov.uk/media/5a804472e5274a2e87db8d3c/Data_Info_and_Methodology.pdf

NNDR seems to use mixed information from columns A - J.

image

MKIM1008 commented 4 months ago

@matkoniecz @polly64

1) Most NNDR datasets from individual councils use a property reference instead of a UPRN. One council said its property reference is the UPRN, but I don't think so. Could we explore whether the property reference number in NNDR could be linked to building polygons?

image image image

2) VOA is using UARN (Unique Address Reference Number) in their Non-Domestic Rating datasets.

UARN: VOA internal key used to link information about the same hereditament - Mihyun has asked the VOA for the list via an FOI request.

https://voaratinglists.blob.core.windows.net/html/documents/Compiled%20Rating%20List%20and%20Summary%20Valuation%20Data%20Specification.pdf

UARN: 11966889000

1299*2710*N006*N00609003481*CW1*LAND USED FOR STORAGE AND PREMISES*11966889000*PARK FARM ALNE STATION | ALNE | YORK | NORTH YORKSHIRE**PARK FARM ALNE STATION*ALNE*YORK**NORTH YORKSHIRE*YO61 1TT**C*6700**26585372000**148G****33560917282*01-APR-2023**
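Records like the sample line are asterisk-delimited, so the UARN and postcode can be pulled out with a plain split. The sketch below is hedged accordingly: the field positions are read off this single sample line rather than taken from the VOA specification linked above, and they should be checked against that document before being relied on.

```python
SAMPLE = (
    "1299*2710*N006*N00609003481*CW1*LAND USED FOR STORAGE AND PREMISES*"
    "11966889000*PARK FARM ALNE STATION | ALNE | YORK | NORTH YORKSHIRE**"
    "PARK FARM ALNE STATION*ALNE*YORK**NORTH YORKSHIRE*YO61 1TT**C*6700**"
    "26585372000**148G****33560917282*01-APR-2023**"
)

# Zero-based positions inferred from the sample line, not from the specification.
DESCRIPTION_INDEX = 5
UARN_INDEX = 6
POSTCODE_INDEX = 14


def parse_record(line: str) -> dict:
    """Split one asterisk-delimited record and pick out a few fields."""
    fields = line.rstrip("\n").split("*")
    return {
        "description": fields[DESCRIPTION_INDEX],
        "uarn": fields[UARN_INDEX],
        "postcode": fields[POSTCODE_INDEX],
    }


record = parse_record(SAMPLE)
```

With the UARN and postcode extracted, the VOA list could be compared against council NNDR lists at postcode level while the UARN to UPRN question is still open.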

image image image

3) FYI, the text below is about how Non-domestic buildings have been geo-referenced using postcode.

https://www.gov.uk/government/statistics/non-domestic-rating-stock-of-properties-2022/background-information image

MKIM1008 commented 4 months ago

@polly64

There's no description of Scat code history/root and it's an unofficial comparison between SIC and Scat.

image https://openlocal.uk/faq

polly64 commented 4 months ago

@MKIM1008 great idea to have look up of scat codes- do you want to try to start linking them tomorrow? and then I can go through helping check with you on Monday?

MKIM1008 commented 4 months ago

@polly64

Sure! I will work on this from tomorrow. - I just wanted you to see this next week. Thank you for your reply!

MKIM1008 commented 3 months ago

@polly64

address generation rule within a postcode

MKIM1008 commented 2 months ago

Hidden places that do not appear in official documents and still need mapping: (community) gardens, small community rooms for tea, corner shops, charity shops, etc.

Green space: categorised

Business: NGO, community led-charity, etc.

MKIM1008 commented 2 months ago

Matching land use in NNDR to building floor information - since it's business tax based, not locations or ground floors.

MKIM1008 commented 1 month ago

Knowledge Transfer Partnership (KTP)