retorquere / zotero-date-from-last-modified


Should the date be 'sanity-checked' prior to being updated? #18

Closed jgrisham closed 9 months ago

jgrisham commented 10 months ago

Observed behavior

On many (over 600 items in my personal Zotero library) items (example #1), the Date field is apparently set by this plugin to '1969-12-31'.

Related?

While I don't know if also caused by this plugin, I have a dozen or so items with the Date field only containing a time (e.g. '21:54:00 +0100' for this URL).

Possible solution

I don't imagine a limited number of checks would add significant overhead to the plugin? ¯\_(ツ)_/¯

Example - only update date if (all?) of the following are true:

  1. Date field is blank (already implemented - thanks, Emiliano!)
  2. The calculated date is 1990 or later (does anyone / any CMSs actually back-date HTTP headers for > 33 year-old documents / publications?)
  3. The calculated date is prior to any of the automatic date fields ('Accessed', 'Date Added', 'Modified') that are not blank
  4. The calculated year is, say, 2200 or earlier (I realize this creates a 'Y2.2k problem', but it might catch 'over-range' dates from malfunctioning / mis-configured webservers)
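As a rough illustration, the four checks above could be combined into one predicate. This is only a sketch, not the plugin's code: `ItemDates`, `shouldUpdateDate`, `MIN_YEAR`, and `MAX_YEAR` are hypothetical names, and check 3 is read here as "earlier than every automatic date that is set", which is one possible interpretation.

```typescript
// Hypothetical shape for the fields the checks need; not a Zotero API type.
interface ItemDates {
  date: string          // current contents of the item's Date field
  accessed?: Date       // 'Accessed', if set
  dateAdded?: Date      // 'Date Added', if set
  dateModified?: Date   // 'Modified', if set
}

const MIN_YEAR = 1990 // assumed lower bound (check 2)
const MAX_YEAR = 2200 // assumed upper bound (check 4)

function shouldUpdateDate(item: ItemDates, candidate: Date): boolean {
  // Reject dates that failed to parse.
  if (Number.isNaN(candidate.getTime())) return false

  // 1. Only fill in a blank Date field.
  if (item.date.trim() !== '') return false

  // 2 and 4. The year must fall inside the plausible range.
  const year = candidate.getUTCFullYear()
  if (year < MIN_YEAR || year > MAX_YEAR) return false

  // 3. The candidate must precede every automatic date that is set.
  const automatic = [item.accessed, item.dateAdded, item.dateModified]
  return automatic.every(d => d === undefined || candidate.getTime() < d.getTime())
}
```

With these bounds, an epoch date such as 1970-01-01 is rejected by check 2 without needing a special case.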

(I'll try to take a look at the code myself if I can, but I wanted to make sure to share my observations before the week got away from me.)

This is a great concept for a plugin; thank you for sharing it with the world!

Cheers,


Version details

Zotero version: 6.0.26 (Windows)

Zotero Date From Last Modified plugin: 0.1.0

github-actions[bot] commented 9 months ago

:robot: this is your friendly neighborhood build bot announcing test build 0.1.3.18.14 ("fixes #18, part 1")

Install in Zotero by downloading test build 0.1.3.18.14, opening the Zotero "Tools" menu, selecting "Add-ons", opening the gear menu in the top right, and selecting "Install Add-on From File...".

retorquere commented 9 months ago

On many (over 600 items in my personal Zotero library) items (example #1), the Date field is apparently set by this plugin to '1969-12-31'.

  • (That seems like a 'default'/'epoch' date, and I'm not sure if it's coming from the HTTP headers for that page or from the plugin.)

That does look like an epoch date, shifted by one day. The URL in the sample doesn't exhibit the problem (anymore), but I first try to convert the date to UTC, and if that's the Unix epoch of 1970-01-01, I make no changes.
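A minimal sketch of that kind of guard, assuming the date comes from a `Last-Modified` header string; `dateFromLastModified` is a hypothetical helper, not the plugin's actual function. One plausible source of the one-day shift is that timestamp 0 formatted in a time zone west of UTC falls on 1969-12-31, which is why the comparison is done on the UTC date.

```typescript
// Hypothetical helper: returns a YYYY-MM-DD date for a Last-Modified header value,
// or undefined when the header is missing, unparseable, or the Unix epoch.
function dateFromLastModified(header: string | null): string | undefined {
  if (!header) return undefined

  const ms = Date.parse(header)
  if (Number.isNaN(ms)) return undefined

  // toISOString always reports UTC, so the local time zone cannot shift the day.
  const utcDate = new Date(ms).toISOString().slice(0, 10)

  // Treat the epoch as "no usable date" and leave the item unchanged.
  return utcDate === '1970-01-01' ? undefined : utcDate
}
```

A caller would skip the update whenever the result is `undefined`.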

Related?

While I don't know if also caused by this plugin, I have a dozen or so items with the Date field only containing a time (e.g. '21:54:00 +0100' for this URL).

Nope, I never set a time in any way with this plugin. That value comes from the standard scraper.

* I can't imagine I would have entered those, but perhaps they were populated by _Zotero_ itself.

The Zotero scraper, yes.

1. Date field is blank _(already implemented - thanks, Emiliano!)_

2. The calculated date is [1990 or later](https://www.google.com/search?q=first+web+server) _(does anyone / any CMSs actually back-date HTTP headers for > 33 year-old documents / publications?)_

Maybe not, but that's what the URL claims. I can see that the epoch date is unlikely, but for this, let's first see if we find samples that necessitate it.

3. The calculated date is _prior_ to any of the automatic date fields _('Accessed', 'Date Added', 'Modified')_ that are not blank

Any live URL is going to be older than the date added/modified? Accessed I could see.

4. The calculated year is, say, 2200 or earlier _(I realize this creates a 'Y2.2k problem', but it might catch 'over-range' dates from malfunctioning / mis-configured webservers)_

2200 is a loooooong time from now, so uninstalling the plugin would get you the same behavior ;)