ecolabdata / ecospheres-dashboard-backend


Better `Resource.format__exists` #20

Open abulte opened 3 weeks ago

abulte commented 3 weeks ago

`format__exists` currently returns True for these values:

prod=# select format__exists, format from resources where dataset_id = '6704893f21c8a307c119c1f5';
 format__exists |              format
----------------+-----------------------------------
 t              | nd7f0a030ff6449129765bec5e6e8a3cd
 t              | n5c86e2753ec34d0a81bd51147ea3c274
 t              | n0cb3d8ebefd6447ba203fbc964d94e6e
 t              | n189c9811e9524fab8bf1221b2b6ec37e
 t              | n0b062c5f40694f1d9aad850c712312d2
 t              | nd3916b2360734843948283a50984b241

They're node identifiers, not "real" formats. Maybe we should treat them as False, using a check like the one in https://github.com/opendatateam/udata/blob/e7a2526f6a2fd084a1fedb08cfdf7b15957c6466/udata/harvest/tests/test_dcat_backend.py#L888.

This is a known data.gouv.fr harvesting bug.
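
A rough sketch of what such a check could look like, assuming the leaked identifiers always have the shape seen above (the letter `n` followed by 32 hexadecimal characters). The names `BNODE_ID_RE`, `looks_like_bnode_id` and `format_exists` are hypothetical and do not come from this repository or from udata; the actual udata test may use a different criterion.

```python
import re
from typing import Optional

# Assumption: a leaked node identifier is "n" followed by 32 hex characters,
# matching the six values in the query output above.
BNODE_ID_RE = re.compile(r"n[0-9a-f]{32}", re.IGNORECASE)


def looks_like_bnode_id(value: str) -> bool:
    """Return True if the value has the shape of a leaked node identifier."""
    return BNODE_ID_RE.fullmatch(value.strip()) is not None


def format_exists(value: Optional[str]) -> bool:
    """Hypothetical stand-in for the `format__exists` computation."""
    # Missing or blank formats do not count as existing.
    if value is None or not value.strip():
        return False
    # Node identifiers are not "real" formats either.
    return not looks_like_bnode_id(value)
```

With this, `format_exists("nd7f0a030ff6449129765bec5e6e8a3cd")` would be False while `format_exists("csv")` would stay True. Matching on the exact shape keeps ordinary short format names that merely start with "n" from being rejected.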

streino commented 3 weeks ago

IIRC this is the id of the anonymous format node. I don't remember whether we should fix this at the SEMIC or data.gouv level.