Open RichardBruskiewich opened 3 years ago
It may be undesirable to allow users to tweak existing KGE File Sets. Rather, insist that every changed version be a new SemVer-versioned release.
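As a rough illustration of "every change is a new SemVer release", here is a minimal sketch of a version bump helper. The function name `bump` and the restriction to plain `MAJOR.MINOR.PATCH` strings (no pre-release or build metadata) are assumptions for the example, not anything taken from the KGE Archive code.

```python
import re

# Plain MAJOR.MINOR.PATCH only; pre-release/build suffixes are out of scope here.
SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def bump(version: str, part: str = "patch") -> str:
    """Return the next version string after `version` (hypothetical helper)."""
    match = SEMVER.match(version)
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    major, minor, patch = (int(group) for group in match.groups())
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")
```

A corrected file would then be released as, say, `bump("1.2.3")` rather than overwriting the files of `1.2.3`.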
It would be helpful to give users more flexible file set creation options on the upload.html web form: e.g. to select an existing KGE File Set and copy its files into a new File Set upload form context, then delete / overwrite / upload other new files, then click "Done Uploading" to save it as the new version.
This does raise the question of who should be allowed to copy/duplicate the data of an existing file set.
From the perspective of primarily being a KG "consumer" (as opposed to a KG "producer"), I tend to view the ability for publishers to modify their data sets wholesale as more of a bug than a feature. Immutable data is far easier to consume than mutable data, and most of KGE is IMO better poised to offer a predominantly immutable model than a predominantly mutable model.
One bright line that is crossed is the "Done Uploading" button (implemented using the /archive/publish endpoint). From that point on, the main way that the user has to correct errors is to publish a new version. I don't think there is much wrong with saying it's the only way.
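To make the "bright line" concrete, here is a minimal in-memory sketch in which a file set is editable only until it is published, and the only way to change it afterwards is to derive a new draft version from it. The class `FileSet` and its methods are hypothetical names for illustration and do not reflect the actual implementation behind /archive/publish.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FileSet:
    """Hypothetical file set: mutable as a draft, frozen once published."""
    kg_id: str
    version: str
    files: Dict[str, bytes] = field(default_factory=dict)
    published: bool = False

    def _require_draft(self) -> None:
        if self.published:
            raise PermissionError(
                f"{self.kg_id} {self.version} is published; create a new version"
            )

    def put_file(self, name: str, content: bytes) -> None:
        # Upload a new file or overwrite an existing one (drafts only).
        self._require_draft()
        self.files[name] = content

    def delete_file(self, name: str) -> None:
        self._require_draft()
        del self.files[name]

    def publish(self) -> None:
        # Stand-in for clicking "Done Uploading".
        self.published = True

    def derive(self, new_version: str) -> "FileSet":
        # Copy the file listing into a fresh, unpublished draft.
        if new_version == self.version:
            raise ValueError("a derived file set needs a different version")
        return FileSet(self.kg_id, new_version, dict(self.files))
```

Under this model a publisher fixes a mistake in `1.0.0` by calling `derive("1.0.1")`, editing the draft, and publishing it, while consumers of `1.0.0` keep seeing exactly what they downloaded.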
I can anticipate two cases in which having a new version upload as the only way to make changes could be a problem.
1) Uploading a version with a known debilitating bug. For this, I would suggest having an operation for a publisher to mark a file set with flags such as "retracted", "deprecated" or "buggy", so that consumers can find out that it's no longer recommended for use. But it seems like it should be on consumers to decide how to respond to this communication, as opposed to preventing them from downloading the data. This seems worth implementing, but not before there are at least a few users.
2) Uploading something under an incorrect license agreement. That is, "oops, sorry, didn't have permission to publish this, please retract". This is important but rare. I think it is probably rare enough that handling it by an email and a manual s3 rm is fine.
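The flagging idea in case 1 could look roughly like the sketch below: the published files stay untouched, and the publisher only attaches advisory flags that consumers can inspect before deciding what to do. The names `Advisory` and `PublishedFileSet` are assumptions made for this example; the flag values are the ones suggested above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Advisory(Enum):
    RETRACTED = "retracted"
    DEPRECATED = "deprecated"
    BUGGY = "buggy"


@dataclass
class PublishedFileSet:
    """Hypothetical record for a published (immutable) file set."""
    kg_id: str
    version: str
    advisories: Set[Advisory] = field(default_factory=set)

    def flag(self, advisory: Advisory) -> None:
        # Only the advisory metadata changes; the files are never touched.
        self.advisories.add(advisory)

    @property
    def recommended(self) -> bool:
        # No advisories means the publisher still recommends this version.
        return not self.advisories

    def can_download(self) -> bool:
        # Flags inform consumers; they do not block access.
        return True
```

Keeping the flags separate from the files means a warning can be added at any time without breaking the immutability of the release itself.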
What happens if the same group uploads two different versions of their graph on the same day? Various use cases: