Open asmecher opened 1 year ago
How do we treat submissions that were created with Quicksubmit and sent to production and were then worked on in the production stage and published later through the normal workflow? Just a thought, not sure if this needs to affect the issue at all...
I like it. A few more use cases to consider:
Some recommendations:
- Change workflow to submitted, since they are being created by the submission wizard, not the editorial workflow.
- Change quickSubmit-heuristic to quickSubmit-guess, just because heuristic is kind of jargon-y.
- Instead of unknown-upgrade, it should be set to null. We are going to run into cases where submissions are created through third-party scripts or tools, so allowing the field to be null is a good way to indicate "we don't know".
- Let's stick to camelCase, quickSubmitHeuristic, or dasherized, quick-submit-heuristic, but not mix the two.
Finally, while we're doing this, let's add a created_at column to the submissions table. This is different from date_submitted and date_published and could be useful in conjunction with the source column.
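To make the "null means we don't know" idea concrete, here is a minimal PHP sketch (PHP 8.1+ for enums). Nothing here is pkp-lib code: the SubmissionSource enum, its case names, and parseSource() are hypothetical, and the dasherized string values are only one possible reading of the vocabulary discussed in this thread.

```php
<?php
declare(strict_types=1);

// Hypothetical vocabulary for the proposed "source" column (an assumption, not pkp-lib code).
enum SubmissionSource: string
{
    case NativeImportExport = 'native-import-export';
    case QuickSubmit = 'quick-submit';
    case QuickSubmitGuess = 'quick-submit-guess';
    case Submitted = 'submitted';
}

// Map a raw column value to the enum.
// null, or a value outside the vocabulary (e.g. from a third-party tool), yields null: "we don't know".
function parseSource(?string $raw): ?SubmissionSource
{
    return $raw === null ? null : SubmissionSource::tryFrom($raw);
}

$known = parseSource('quick-submit');   // SubmissionSource::QuickSubmit
$unknown = parseSource(null);           // null
```

Keeping the column nullable means a third-party script that never sets it still produces a valid row, and reports can group those rows under "unknown" without inventing a placeholder value.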
On second thought, what about just quickSubmit-old or quickSubmit-legacy to indicate that they represent historical data?
How do we treat submissions that were created with Quicksubmit and sent to production and were then worked on in the production stage and published later through the normal workflow?
@ajnyga AFAICS it will fail to be recognized as an imported submission only if you unpublish it and then change the publication date to somewhere in the future (after the submitted date). If you're creating new versions, it should work fine.
- Instead of unknown-upgrade, it should be set to null
Agreed.
- Let's stick to camelCase or dasherized
If we're going to write custom keys, I vote for whichever format we're using the most (I personally prefer dasherized for database keys).
Alternative: instead of quickSubmit and quickSubmit-guess, I'd just keep quickSubmit, to avoid creating room (a space in the report with "Submissions probably imported by QuickSubmit plugin") for something that's not going to exist in the future.
Finally, while we're doing this, let's add a created_at
Agreed, I've created a discussion for this here: https://github.com/pkp/pkp-lib/discussions/7977
In the absence of historic data/audit/backups, solving unexpected bugs/issues becomes much easier when you have access to some timestamps.
@defstat made some suggestions to me on Mattermost that involved creating a better machine-readable audit trail that covered the submission's lifespan in general, not just where it was created. In my opinion that risks becoming much bigger than the use cases documented here, but one way of achieving it without creating a new mechanism would be to use the existing event log toolset. It is already machine-readable-ish, though we do need to improve our current practice around numeric constants for event_type -- it does not grow well e.g. across upgrades and with plugins.
If we choose the event log, resolving this issue could be done by just making sure there are appropriate event log entries for where the submission is created that are different for each source (SWORD, QuickSubmit, XML import/export, etc).
This has downstream benefits of potentially letting us capture the provenance of a submission even if it moves between systems -- for example, a QuickSubmit submission that was later exported and imported via XML.
I think event logs should be used mostly for auditing generic actions (if I wipe out this table, it shouldn't cause much harm to the system). As this information is somehow part of the submission state, and will be used at the statistics page, I think it's better to have a dedicated field for it instead of looking for data in the trunk, but that's just my opinion =]
I think every action should be recorded in the activity log, including an import, quick submit, "normal" submission, etc. If we need data of the submission state, it should be recorded as last_source or something, to indicate that it can change and doesn't represent immutable data.
My proposal for that is to extend the "audit recordings" (activity log/event log) as needed and then use them as the base/source to populate the new column (or any related one) with, for example, an indicator of whether a submission should be taken into account for the statistics. The statistical data will be retrieved the way it is now, with no changes to that.
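A rough PHP sketch of that idea, deriving a last_source value from log entries. The entry shape (plain arrays with 'date' and 'source' keys) and the deriveLastSource() name are assumptions made for illustration; the real event log stores entries differently.

```php
<?php
declare(strict_types=1);

/**
 * Derive a "last_source" value from event log entries.
 * Each entry is a hypothetical array: ['date' => 'Y-m-d H:i:s', 'source' => ?string].
 */
function deriveLastSource(array $entries): ?string
{
    // Keep only the entries that actually record a source.
    $withSource = array_filter(
        $entries,
        fn (array $e): bool => ($e['source'] ?? null) !== null
    );
    if ($withSource === []) {
        return null; // no provenance recorded at all
    }
    // Sort oldest to newest; the fixed-width date format sorts correctly as a string.
    usort($withSource, fn (array $a, array $b): int => strcmp($a['date'], $b['date']));
    return end($withSource)['source'];
}

// A QuickSubmit submission that was later exported and re-imported via XML.
$log = [
    ['date' => '2023-05-01 09:30:00', 'source' => 'nativeImportExport'],
    ['date' => '2021-01-10 14:00:00', 'source' => 'quickSubmit'],
    ['date' => '2022-02-02 08:00:00', 'source' => null], // unrelated log entry
];
$lastSource = deriveLastSource($log); // 'nativeImportExport'
```

The log keeps the full history, while the derived column only answers "where did this last come from", so it can be recomputed if the log is corrected.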
@defstat's draft PRs:
A few quick thoughts:
The iSubmissionIntroducer interface doesn't add much, in my opinion, since anything that introduces a submission and needs to pass itself into the Repository::add function could just as easily pass in a log entry. If the ::add function required a SubmissionIntroducerEventEntry parameter, then it could let the calling code instantiate it. The SubmissionIntroducerEventEntry could have a:
    final public function getEventType(): int
    {
        return SUBMISSION_LOG_CREATED;
    }
...and possibly even something like...
    public function getSource(): string
    {
        return $this->getData('source');
    }

    public function getParams(): array
    {
        // Merge the stored params with the source so both travel together.
        return array_merge($this->getData('params'), [
            'source' => $this->getSource(),
        ]);
    }
This way calling code will need to instantiate and configure a SubmissionEventLogEntry before calling Repository::add. It doesn't build in any requirement that SubmissionEventLogEntry be subclassed, which is good IMO because we can expect plugins to use this facility and their subclasses will come and go.
Down the line I think it'll be sensible to expect the event log DAO to instantiate objects of the right class, but we don't need to tackle that right now.
Feel free to experiment with any of this!
@defstat, I'm deferring this -- we likely won't have time to get it merged before RC1.
@asmecher New PRs with requested changes
- OJS: pkp/ojs#3905
- PKP-LIB: #8996
- QuickSubmitPlugin: pkp/quickSubmit#72
@defstat, I'm going to hold off on reviewing this until @Vitaliy-1 has gotten https://github.com/pkp/pkp-lib/issues/8933 merged; it'll probably mean some changes are required here too.
Wonderful idea! Thanks for working on this!
Describe the bug
At the moment, submissions can be created via 3 mechanisms:
Later, it's sometimes necessary to try to distinguish between these sources using heuristics. For example:
Recommendation:
Add a column to the submissions table, called e.g. "source", allowing a plain-text indication of the submission's origin.
Suggested vocabulary for the column:
- nativeImportExport for submissions that were imported via the native import/export plugin
- quickSubmit for submissions that were created by the quickSubmit plugin
- workflow for submissions that were created using the normal editorial workflow
- quickSubmit-heuristic for submissions that were upgraded from a version without the source column, but appear by heuristic to have been created using quickSubmit
- unknown-upgrade for submissions that were upgraded from a version without the source column, and didn't appear to be created using quickSubmit by the heuristic
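For the two upgrade-only values, a backfill step could look roughly like this PHP sketch. The backfillSource() function is hypothetical, and the heuristic is passed in as a callable because the actual rule isn't fixed here. The date comparison used in the example is only a placeholder assumption, not the agreed heuristic.

```php
<?php
declare(strict_types=1);

/**
 * Pick a "source" value for a submission that predates the column.
 * $looksLikeQuickSubmit is whatever heuristic the upgrade script settles on.
 */
function backfillSource(array $submission, callable $looksLikeQuickSubmit): string
{
    return $looksLikeQuickSubmit($submission) ? 'quickSubmit-heuristic' : 'unknown-upgrade';
}

// Placeholder heuristic (an assumption): the submission was published before it was submitted.
$publishedBeforeSubmitted = fn (array $s): bool =>
    isset($s['date_published'], $s['date_submitted'])
    && strcmp($s['date_published'], $s['date_submitted']) < 0;

$rows = [
    ['submission_id' => 1, 'date_submitted' => '2022-03-01', 'date_published' => '2019-06-15'],
    ['submission_id' => 2, 'date_submitted' => '2022-03-01', 'date_published' => '2022-09-30'],
    ['submission_id' => 3, 'date_submitted' => '2022-03-01', 'date_published' => null],
];

$sources = array_map(
    fn (array $r): string => backfillSource($r, $publishedBeforeSubmitted),
    $rows
);
// $sources: ['quickSubmit-heuristic', 'unknown-upgrade', 'unknown-upgrade']
```

Passing the heuristic as a parameter keeps the vocabulary decision separate from the detection rule, so either can change without touching the other.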