yzqzss closed 1 year ago
So are you saying we never enter this "else" any more?
The problem was that Wikia was adding <sha1/> tags under pages, not revisions, in the XML.
Whether Wikia is still Wikia and the fact that the code was added a long time ago are not really relevant. The question is whether (a) the code's presence is causing problems and/or (b) this ever happens (at Fandom/Wikia or elsewhere). If Fandom/Wikia is no longer exporting these invalid SHA1s, I don't really object to removing it, since I haven't seen it anywhere else.
My impression is that I don't really see what the harm is. If we ever see a SHA1 outside of a <revision>, we really do want to strip it!
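To make the case being discussed concrete, here is a minimal sketch of stripping a <sha1> that sits directly under <page> while leaving the ones inside <revision> alone. It is not the project's actual code: the function name `strip_page_level_sha1` is hypothetical, and the sample omits the XML namespace that real MediaWiki exports carry.

```python
import xml.etree.ElementTree as ET


def strip_page_level_sha1(page_xml: str) -> str:
    """Remove <sha1> elements that are direct children of <page>.

    Hypothetical helper for illustration; <sha1> inside <revision> is kept.
    Assumes the fragment has no XML namespace.
    """
    page = ET.fromstring(page_xml)
    # findall("sha1") matches direct children only, so <revision>/<sha1> is untouched.
    for stray in page.findall("sha1"):
        page.remove(stray)
    return ET.tostring(page, encoding="unicode")


sample = (
    "<page><title>Example</title><sha1/>"
    "<revision><id>1</id><sha1>abc</sha1></revision></page>"
)
cleaned = strip_page_level_sha1(sample)
```

With this approach a dump that never has page-level <sha1> passes through unchanged, which is why keeping such a check around is cheap.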
<text> contains a sha1 attr, it does not have the same meaning as a separate sha1 tag (cf. MediaWiki MCR and export 0.11).
<text> attr's (including sha1 attr, deleted attr, etc.), it only recognizes individual sha1 tags.
"<sha1>*</sha1>" from the pipe is simple. :(
It's 2023, and wikia isn't wikia anymore.
Without the sha1, revisions from a wikidump cannot be deduplicated on import.
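As a rough illustration of why the sha1 matters here, the sketch below drops revisions that have already been seen. The function name `dedupe_revisions`, the dict layout, and the choice of (timestamp, sha1) as the key are all assumptions made for the example, not a description of how any particular importer works.

```python
from typing import Dict, Iterable, Iterator, Optional, Set, Tuple

Revision = Dict[str, Optional[str]]


def dedupe_revisions(revisions: Iterable[Revision]) -> Iterator[Revision]:
    """Yield each revision once, keyed on (timestamp, sha1).

    Hypothetical helper: a revision without a sha1 cannot be compared,
    so it is always passed through, duplicates included.
    """
    seen: Set[Tuple[Optional[str], str]] = set()
    for rev in revisions:
        sha1 = rev.get("sha1")
        if sha1 is None:
            yield rev  # no hash, so nothing to deduplicate on
            continue
        key = (rev.get("timestamp"), sha1)
        if key in seen:
            continue
        seen.add(key)
        yield rev


revs = [
    {"timestamp": "2023-01-01T00:00:00Z", "sha1": "abc"},
    {"timestamp": "2023-01-01T00:00:00Z", "sha1": "abc"},  # duplicate, dropped
    {"timestamp": "2023-01-01T00:00:00Z", "sha1": None},
    {"timestamp": "2023-01-01T00:00:00Z", "sha1": None},   # same content, but kept
]
unique = list(dedupe_revisions(revs))
```

The last two entries show the problem: once the sha1 is stripped, identical revisions can no longer be told apart this way.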