ThreeSixtyGiving / standard

The 360Giving data standard for UK philanthropic giving
http://www.threesixtygiving.org

Company number should be a string of 8 characters #121

Open stevieflow opened 8 years ago

stevieflow commented 8 years ago

We could enforce an 8 character string for company number - to highlight issues where the number is not valid

Use case: company numbers often become invalid when the initial 0 is lost via spreadsheet formatting. Enforcing the length might be one step towards detecting and fixing this

(via: https://github.com/OpenDataServices/flatten-tool/issues/87)
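
As an illustration of the leading-zero problem described above, here is a minimal sketch of how a digits-only value that has lost its leading zeros could be padded back to 8 characters. The function name `pad_company_number` is hypothetical and not part of the 360Giving standard or flatten-tool; padding can also mask a genuinely wrong value, so this is a starting point rather than a recommended fix.

```python
def pad_company_number(value: str, length: int = 8) -> str:
    """Left-pad a digits-only value with zeros up to `length` characters.

    Hypothetical helper for illustration only. Values containing
    anything other than ASCII digits (e.g. a letter prefix) are
    returned unchanged, as are values already long enough.
    """
    if value.isascii() and value.isdigit() and len(value) < length:
        return value.zfill(length)
    return value


# A number whose leading zero was dropped by a spreadsheet
print(pad_company_number("1234567"))
```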

timgdavies commented 8 years ago

Agreed that this would be good. We might need to think carefully about the RegEx etc. to use for this.

ekoner commented 7 years ago

Not all company numbers start with 0 - certain types of company have a 2-character prefix, e.g. SC: https://beta.companieshouse.gov.uk/company/SC323716

BobHarper1 commented 7 years ago

Scottish companies are prefixed with SC and Northern Irish companies with NI, both followed by an eight-digit number starting with 0 (I believe the remainder always starts with '0' - I've not seen any different, but there may be exceptions).

English and Welsh incorporated bodies use just the eight-digit string.

There are a number of variations on the prefix, but it is always a two-character string.

RegEx something like /^[a-zA-Z]*?[0-9]{8}$/? https://regex101.com/r/53Eunz/2
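
To make the behaviour of the suggested pattern concrete, here is a small sketch that applies it in Python. The pattern is copied from the comment above; the function name `matches_proposed_pattern` is hypothetical and used only for illustration.

```python
import re

# Pattern as suggested in the comment: optional letters, then exactly 8 digits.
PROPOSED = re.compile(r"^[a-zA-Z]*?[0-9]{8}$")


def matches_proposed_pattern(value: str) -> bool:
    """Return True if `value` matches the proposed company number pattern."""
    return PROPOSED.match(value) is not None


for candidate in ["01234567", "SC01234567", "1234567", "SC323716"]:
    print(candidate, matches_proposed_pattern(candidate))
```

Under this pattern, the SC323716 example linked earlier in the thread would not match, because only six digits follow its prefix. The pattern also places no limit on the number of prefix letters.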

stevieflow commented 6 years ago

@robredpath can you point to, or provide details of, what we currently have in the additional checks in CoVE?

robredpath commented 6 years ago

@stevieflow see https://dataquality.threesixtygiving.org/additional_checks :

Heading: a value provided in the Recipient Org: Company Number column that doesn’t look like a company number

Message: Common causes of this are missing leading digits, typos or incorrect values being entered into this field.

Method: Checks if any grants have RecipientOrg company numbers that don't look like company numbers. Checks if the value is 8 characters long, and that the last 6 of those characters are numbers
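
For reference, the rule described under "Method" could be sketched roughly as follows. This is a reading of the description above, not CoVE's actual code; the function name `looks_like_company_number` is hypothetical, and the sketch assumes "numbers" means ASCII digits 0-9.

```python
def looks_like_company_number(value: str) -> bool:
    """Sketch of the described check: the value is 8 characters long
    and the last 6 of those characters are digits.

    Assumption: "numbers" means ASCII digits 0-9.
    """
    if len(value) != 8:
        return False
    tail = value[-6:]
    return tail.isascii() and tail.isdigit()


for candidate in ["01234567", "SC323716", "1234567", "SC32371X"]:
    print(candidate, looks_like_company_number(candidate))
```

Because only the last six characters are constrained, this accepts both all-digit numbers and numbers with a two-character prefix such as SC, while still flagging a value that has lost its leading zero.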