Closed by tdavis9 2 years ago
Thanks, @tdavis9. What would you recommend as an appropriate maximum length?
For 1.2, we can't add a validation rule (which would be backwards-incompatible), but we can add a recommendation in the field's description for an appropriate length.
@jpmckinney we would recommend a maximum of 150 to 200 characters. Understood that this would need to be a recommendation rather than a validation rule. Here is an example of one of the problematic titles:
The framework agreement may be used by INTRAN members namely, Age UK Norwich, Big C, Benjamin Foundation,Breckland District Council, Broadland District Council, Broadland Housing Association, Cambridgeshire CRC, EACH, East Coast Community Healthcare, Equal Lives, Essex County Council, Essex CRC, Flagship Housing, Forest Heath District Council, St Edmundsbury Borough Council, Freebridge Community Housing, Great Yarmouth Borough Council, Healthwatch NorflkHrtfordshire County Council, James Page University Hospitals NHS Foundation Trust, King's Lynn and West Norfolk Borough Council, Kings Lynn Area Resettlement Support (KLARS), Leeway Domestic Abuse Services, Lighthouse Women's Aid, Magdalen Group, Mancroft Advice Project (MAP), Marie Stopes, Matthew Project, Mid-Norfolk CAB, Mistura Informatics — Choice and Medication, Mundesley Hospital, NHS England, NHS Great Yarmouth and Waveney CCG, NHS North Norfolk CCG, NHS Norwich CCG, NHS South Norfolk CCG, NHS West Norfolk CCG, Norfolk & Norwich University Hospital, Norfolk and Suffolk NHS Foundation Trust, Norfolk and Suffolk Probation CRC, Norfolk CAB, Norfolk Community Health and Care, Norfolk Community Law Service, Norfolk Constabulary, North-Norfolk District Council, Norwich & Central Mind, Norwich City Council, Norwich Charitable Trusts, Norwich Consolidated Charities, Ormiston Victory Academy, Right for Success Academy, Saffron Housing Trust, South Norfolk Council, Sue Lambert Trust, Suffolk Constabulary, Suffolk County Council, City College Norwich, City Academy Norwich, Wayland Academy, Fakenham Academy, Attleborough Academy, The Queen Elizabeth Hospital Kings Lynn NHS Trust, Together for Well Being, Victory Housing Trust, West Suffolk Hospital NHS Trust, Wherry Housing Association, Norfolk
For fields that would likely be used in an index, 800 characters is the max that we've seen it function decently at.
Indeed – putting all the buyers within a framework agreement in the tender title is not helpful :)
Can we check a few collection summaries to get the 90th percentile of title length? e.g.
SELECT PERCENTILE_CONT(0.9) WITHIN GROUP(ORDER BY LENGTH(tender_title)) FROM view_data_bi_tool_portugal.tender_summary;
Note that some publishers don't have tender titles.
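For anyone who wants to run the same check outside the database, here is a rough Python sketch of what the query computes. The function names are made up for illustration, and it assumes the titles have already been loaded into a list where missing titles appear as None or empty strings. It uses the same linear interpolation that PERCENTILE_CONT is defined with, so results should line up with the SQL, but treat it as a sketch rather than a verified replacement.

```python
import math
from typing import Iterable, Optional


def percentile_cont(values: Iterable[float], fraction: float) -> Optional[float]:
    """Continuous percentile using linear interpolation between neighbours."""
    ordered = sorted(values)
    if not ordered:
        return None  # no data: the SQL aggregate would yield NULL
    # Fractional row position within the sorted values (0-based).
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    if lower == upper:
        return float(ordered[lower])
    weight = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def title_length_percentile(titles: Iterable[Optional[str]], fraction: float = 0.9) -> Optional[float]:
    """Percentile of title lengths, skipping publishers' missing or empty titles."""
    lengths = [len(title) for title in titles if title]
    return percentile_cont(lengths, fraction)
```

Calling `title_length_percentile(titles)` on one dataset's titles gives that dataset's 90th percentile length, or None when the publisher has no titles at all.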
Update: There's probably duplication of data sources among these schemas, but the range is 15-301 characters for the 90th percentile. Among these numbers, the average is 98, the median is 92, and the 90th percentile is 151.
If we choose a limit of 150, then about 90% of datasets will have 90-100% of their titles with fewer than 150 characters. And 10% of datasets will have 10% or more of their titles with more than 150 characters.
With this update, the 90th percentile is 148.1. We have 33 publishers with information about tender titles and 86 publishers where the value is empty or missing.
I think we can choose the limit between 160 and 200 characters.
Another recommendation we can add, which would be very helpful for publishers, is that information about the organizations should go in parties rather than all being listed in tender.title.
@jpmckinney @yolile What do you think of the suggestions?
Let's go with 150.
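Since this will be a recommendation in the field's description and not a validation rule, a tool could report long titles as warnings instead of errors. A minimal sketch, assuming a release is a plain dict with the title at tender.title; the function name and the constant are hypothetical and not part of any existing OCDS tool:

```python
from typing import List

# The figure agreed in this thread; a recommendation, not a schema constraint.
RECOMMENDED_MAX_TITLE_LENGTH = 150


def title_length_warnings(release: dict, limit: int = RECOMMENDED_MAX_TITLE_LENGTH) -> List[str]:
    """Return warnings (never errors) for a tender title over the recommended length."""
    tender = release.get("tender") or {}
    title = tender.get("title")
    # Missing or non-string titles are not this check's concern.
    if not isinstance(title, str) or len(title) <= limit:
        return []
    return [f"tender.title has {len(title)} characters; recommended maximum is {limit}"]
```

Returning a list keeps the check non-blocking: data with long titles still passes, and the publisher simply gets a note.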
We found when trying to use UK OCDS data that some of the titles were so long (mostly because they listed information such as organizations in the title) that using them in charts and exports essentially broke the system, and we had to create our own limits and cut the titles off at a certain point. Specifying how long a tender title (and other similar fields) can be would help make the data more usable.
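For illustration, the kind of cut-off described here could look something like the sketch below. This is not our actual code: the function name, the 150-character default and the ellipsis marker are all assumptions chosen for the example.

```python
def truncate_title(title: str, limit: int = 150, ellipsis: str = "…") -> str:
    """Shorten a title for display so the result is at most `limit` characters."""
    if len(title) <= limit:
        return title
    # Leave room for the ellipsis marker inside the limit.
    cut = title[: limit - len(ellipsis)]
    # Prefer to break at the last space so a word is not split in half.
    head, _, _ = cut.rpartition(" ")
    if head:
        cut = head
    # Drop trailing separators left over from lists of organization names.
    return cut.rstrip(" ,;:") + ellipsis
```

This only affects how a title is displayed or exported; the full title stays in the source data.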