If a user has entered a search term and/or selected some filters, the formats filter should only list formats of data files that belong to datasets in the search results.
We can get this information by using faceting in the Solr query. We can specify the res_format field as a facet and set the minimum count to 1, so that only formats that appear in the results are returned.
The facet data returned includes a number of formats that aren't valid or display friendly (users are able to enter formats manually, and some arrive as part of a data import). We need to clean up this data so that it maps to valid formats, and add validation so that users can't add custom formats in future.
This query lists all available formats and a count for each in the Solr UI - /solr/#/ckan/query?q=*:*&q.op=OR&indent=true&facet=true&facet.field=res_format&facet.sort=count&facet.mincount=1&fl=res_format&rows=1
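As a rough illustration of the faceting approach, the sketch below builds the facet parameters for a search and reads the counts back out of a Solr JSON response. The function names and the example filter query are hypothetical, and the parsing assumes Solr's default flat list layout for facet fields (alternating value and count); it is not the code used in datagovuk_find.

```python
from urllib.parse import urlencode


def build_format_facet_query(search_term="", filter_queries=()):
    """Return a Solr query string that facets on res_format.

    search_term and filter_queries are whatever the user has entered or
    selected; with no search term we fall back to matching everything.
    """
    params = [
        ("q", search_term or "*:*"),
        ("facet", "true"),
        ("facet.field", "res_format"),
        ("facet.sort", "count"),
        ("facet.mincount", 1),  # skip formats with no matching datasets
    ]
    # Each selected filter becomes its own fq parameter.
    params.extend(("fq", fq) for fq in filter_queries)
    return urlencode(params)


def parse_format_counts(response):
    """Turn Solr's flat facet list, e.g. ["CSV", 10, "PDF", 3], into a dict."""
    flat = response["facet_counts"]["facet_fields"]["res_format"]
    return dict(zip(flat[::2], flat[1::2]))


# Hypothetical usage: a search for "flood" restricted by one filter.
query_string = build_format_facet_query("flood", ['organization:"example-org"'])
```

Because the search term and filters are part of the same query, the facet counts only cover datasets in the current results, which is the behaviour described above.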
We agreed that the best workaround to start with is to map anything that isn't in the default list to "Other" - we can always review the list and do a cleanup of the formats later.
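The "Other" workaround could look something like the sketch below. The DEFAULT_FORMATS set and the function name are placeholders for illustration only; the real default list is whatever the team agrees on.

```python
from collections import Counter

# Hypothetical default list, for illustration only.
DEFAULT_FORMATS = {"CSV", "GEOJSON", "HTML", "JSON", "PDF", "XLS", "XML", "ZIP"}


def normalise_format_counts(facet_counts, allowed=DEFAULT_FORMATS):
    """Collapse raw res_format facet counts onto the allowed list.

    Values are trimmed and upper-cased before matching; anything that
    still isn't in the allowed list is counted under "Other".
    """
    cleaned = Counter()
    for raw_format, count in facet_counts.items():
        label = raw_format.strip().upper()
        cleaned[label if label in allowed else "Other"] += count
    return dict(cleaned)


counts = normalise_format_counts({"CSV": 10, "csv ": 2, "Web page?": 4, "PDF": 3})
```

One caveat: facet counts are per dataset, so summing them can count a dataset twice if it has more than one format that collapses to the same label. The merged numbers are best treated as approximate.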
See WIP - https://github.com/alphagov/datagovuk_find/commit/1f9ee3c1e7db347c3bcf5f4ed1cb831016b7f9aa on branch topic-format-filters
For more information on faceting see https://solr.apache.org/guide/8_11/faceting.html