ThreeSixtyGiving / standard

The 360Giving data standard for UK philanthropic giving
http://www.threesixtygiving.org
Other
10 stars 15 forks source link

Leading zeros in organisational identifiers #222

Closed morchickit closed 6 years ago

morchickit commented 6 years ago

We got a question about leading zerosin organisational identifiers from Trafford Council, and I realised there is no information about it in the documentation. What is the status on leading zero and can we add it to the documentation?

BobHarper1 commented 6 years ago

I think we did have something previously. I think it went with the review of that page. I found this https://github.com/ThreeSixtyGiving/standard/compare/d25b355e4fd368751a4f6888a13f210dd99529ea...381badc5ca94eb45de8cf284f0a570ac66bc02d5 (Line 122) which was added then removed between two close commits.

I think it needs more advice than simply 'use the leading zero' as people are often caught out by software that removes it without their realisation.

stevieflow commented 6 years ago

To be clear though - this is about

-- The issue is that for some, the leading xero "disappears", which is (we think) due to MS Excel making assumptions (does this happen in other spreadsheet software?).

In turn, we think this is a documentation issue, but more implementation guidance rather than core via the description of this field in the schema:

Registered UK company number, if applicable.

As @BobHarper1 indicated, some guidance was included , but removed through discussion. And - the issue is not as straightforward or generic

@morchickit agree? @K_Duerden also agree?

Further - we can't consider schema validation on this, as a) it would render lots of data invalid b) we're heading to a UK-centric implementation issue

stevieflow commented 6 years ago

(sorry, closed by mistake)

KDuerden commented 6 years ago

Edafe created this guidance, in response to publisher questions on this. This page was orphaned during our recent update:

https://www.threesixtygiving.org/supportobsolete__trashed/guidance/do-i-need-the-leading-0-in-company-numbers/

BobHarper1 commented 6 years ago

That's great, I think we could work that into the documentation keenly enough.

itsozz commented 6 years ago

Thanks @morchickit for posting the question and to everyone for the comments on this, exactly what I needed to know.

@stevieflow you're absolutely correct in your assumptions as far as I'm concerned - it's spreadsheets with cells formatted as numbers that are the issue. I got the same behaviour in MS Excel as I did in LibreOffice. I have these cells formatted as text as the standard does define the company and charity number fields as type "string", however perhaps some publishers don't notice if the software changes it as @BobHarper1 mentioned.

The document created by Edafe explains the issue really well, thanks @KDuerden.

KDuerden commented 6 years ago

Practically in supporting publishers I've tended to take a pragmatic approach and focus efforts on reinstating missing 0 when creating the Org IDs (which can be easily included as part of formulas to add the prefix), but not getting to hung up on the Recipient Org:Company Number column, because that hard work can be easily undone in the moves between Excel vs CSV.

BobHarper1 commented 6 years ago

Adding to this issue - because of a recently discovered quirk with respect to saving files from OpenRefine which can cause company and charity numbers to become 12345678.0 (therefore potentially making identifiers wrong and breaking links in GrantNav)

The solution is the same however: format as text, not number.

So I think the approach we take should also cover Recipient Org:Charity Number as well.

stevieflow commented 6 years ago

saving files from OpenRefine

I think it's only when you export as .xlsx though - not .csv?

BobHarper1 commented 6 years ago

Just tested this to make sure (by exporting .xlsx and .csv from Refine and then running through COVE) - yes that's correct, xlsx (and presumably xls) have that effect.

So it's not only saving from Excel to CSV, it's also OpenRefine to Excel.

morchickit commented 6 years ago

Ok, so should we then update @BobHarper1 on the website (where Edafe's doc is) and maybe put it somewhere more prominent? @KDuerden is that ok?

KDuerden commented 6 years ago

Linking from the Org Identifier section in References docs would be good. In terms of location, it seems to fit best either after the 'Additional 360Giving fields for charity and company number' section. Or maybe in the top part of http://standard.threesixtygiving.org/en/latest/identifiers/#organisation-identifier, under the 'For example' about Charity numbers?

@morchickit we need to at least sort the location in wordpress to tidy up the url.

BobHarper1 commented 6 years ago

@KDuerden probably a hint under 'For example' would be best - not all will necessarily use the additional fields but the org-id is a required field

KDuerden commented 6 years ago

Sounds good. I've cleaned up url to the page: https://www.threesixtygiving.org/support/company-numbers/

morchickit commented 6 years ago

Great, Bob, I can add a link through the documentation!

On Fri, Dec 8, 2017 at 10:36 AM, KDuerden notifications@github.com wrote:

Sounds good. I've cleaned up url to the page: https://www.threesixtygiving.org/support/company-numbers/

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ThreeSixtyGiving/standard/issues/222#issuecomment-350230965, or mute the thread https://github.com/notifications/unsubscribe-auth/ADHzuy94m1o_5qfY3IuJxjQ6WaJ9l8Tbks5s-REVgaJpZM4QWuIQ .

BobHarper1 commented 6 years ago

Do we want to review the text on that page or happy with it as it is?

It's simple stuff, but might be worth adding a link to how to change the column format in Excel:

You can fix this in Excel by formatting the column as text. to You can fix this in Excel by formatting the column as text

BobHarper1 commented 6 years ago

Here we are now http://standard.threesixtygiving.org/en/222-leading-zeros/identifiers/#organisation-identifier

KDuerden commented 6 years ago

Bit more wordy, but I think it is worth spelling out the issue so people know why to care. Suggested edit:

UK company numbers are a unique combination of eight numbers but in some cases it includes letters. The majority of company numbers for companies registered in England and Wales start with a leading zero. Publishers should be aware of the problems that missing leading zeros in UK company numbers present when creating identifiers. Learn more about how to avoid this pitfall.

BobHarper1 commented 6 years ago

@KDuerden I think the changes I've made below is a bit more precise?

UK company numbers are a unique combination of eight numbers digits but in some cases it includes letters, which in some cases include letters as well as numbers. The majority of company numbers for companies registered in England and Wales start with a leading zero.

Publishers should be aware of the problems that missing leading zeros in UK company numbers present when creating identifiers. Learn more about how to avoid this pitfall.

KDuerden commented 6 years ago

Perfect. Happy to go ahead with this.