Open seanherron opened 11 years ago
Hi, Sean.
A few reasons. The main one to be wary of is that organizations are permissions structures in CKAN. I don't think it would be appropriate to map harvested datasets to organizations based on the publisher field. You risk giving write-permission to something that shouldn't be edited, or losing permission to update it later. If anything, they should all map to a single Harvester organization.
Groups might be more appropriate.
But also, publisher is a string field. Mapping to organizations/groups may be complex and the logic may depend on whose catalog it is. Also, managing the creation/updating/deletion of organizations/groups is a lot more work that I didn't want to get into.
Am definitely not opposed to seeing a way to map datasets to groups though. That'd be very handy.
"You risk giving write-permission to something that shouldn't be edited, or losing permission to update it later. If anything, they should all map to a single Harvester organization."
Don't understand. Are the permissions that are gained/lost with regard to editing metadata?
Our CKAN system will use data.json to populate the catalog. We are also offering the ability for users to log in to enter their metadata, upload files, etc. The metadata is used to produce data.json files for the organization.
"But also, publisher is a string field. Mapping to organizations/groups may be complex and the logic may depend on whose catalog it is. Also, managing the creation/updating/deletion of organizations/groups is a lot more work that I didn't want to get into."
Yes, mapping may be inherently complex. We'll likely have to use some machine learning if we start seeing significant variations in the entered data. For right now, though, we hope we won't have too many endpoints to harvest, so we can establish standards and procedures to minimize the technical problem of establishing identity.
For us, organizations/groups are great as they're already baked into CKAN.
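As a rough illustration of the simplest form of that identity matching, here is a minimal sketch that folds a free-text publisher string into a slug-like name that could be used as a group name. The function name `publisher_to_group_name` is hypothetical and not part of the extension, and the 100-character limit is an assumption about CKAN's name rules.

```python
import re
import unicodedata


def publisher_to_group_name(publisher: str) -> str:
    """Fold a free-text publisher string into a slug-like name (hypothetical helper)."""
    # Decompose accented characters, drop anything non-ASCII, and lowercase,
    # so that case and accent variants compare equal.
    text = unicodedata.normalize("NFKD", publisher)
    text = text.encode("ascii", "ignore").decode("ascii").lower()
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    text = re.sub(r"[^a-z0-9]+", "-", text)
    # Assumed length limit of 100 characters; trim stray hyphens at both ends.
    return text[:100].strip("-")
```

This only absorbs differences in case, spacing, and punctuation. Abbreviations or reworded names (for example "USDA" versus the full department name) would still produce different slugs, which is where agreed standards or smarter matching would have to come in.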
Given this information, are there other reasons not to use publisher?
Well, like I said, I don't think orgs make sense. Groups make sense.
But if you guys submit a patch to do either, I'd be glad to merge it.
Oh, see #5 though --- I merged Fuhu Xia's patch assigning datasets to the org that owns the harvester source.
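To make the resulting behavior concrete, here is a minimal sketch of the mapping as described in this thread: publisher stays a plain string in the author field, and the organization comes from the harvest source rather than from the data.json entry. The function `build_package_dict` and its parameters are hypothetical and are not taken from the extension's code or from the patch in #5.

```python
def build_package_dict(dataset, source_owner_org=None):
    """Sketch: turn one data.json entry into a CKAN-style package dict.

    `dataset` is a single entry from a data.json file; `source_owner_org`
    is the id of the organization that owns the harvest source, if any.
    Both names are illustrative assumptions.
    """
    pkg = {
        "title": dataset.get("title", ""),
        "notes": dataset.get("description", ""),
        # publisher is free text, so it is kept as a string in `author`
        "author": dataset.get("publisher", ""),
    }
    # The organization is inherited from the harvest source, never
    # derived from the publisher string.
    if source_owner_org:
        pkg["owner_org"] = source_owner_org
    return pkg
```

With this shape, two entries with different publisher strings harvested from the same source end up in the same organization, so permissions follow the source and not the free-text field.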
I'm working on a modification to the extension to parse out data.json files by the organization they belong to in CKAN. One question I have is with the implementation of the publisher field - why does it map to author in CKAN rather than to organization? Was going to change this around but wanted to check on the rationale behind it first. Thanks!