wpoa / open-access-media-importer

A tool for harvesting media files from Open Access articles for upload into Wikimedia Commons
http://commons.wikimedia.org/wiki/User:Open_Access_Media_Importer_Bot

Keep non-ASCII characters in description #74

Closed by Daniel-Mietchen 10 years ago

Daniel-Mietchen commented 11 years ago

We deliberately strip non-ASCII characters from the file names on Commons, but they are apparently also stripped from the article titles and author names in the description, for no reason that I can see.

Example: http://commons.wikimedia.org/wiki/File:-Arrestin2-Regulates-Lysophosphatidic-Acid-Induced-Human-Breast-Tumor-Cell-Migration-and-Invasion-pone.0056174.s003.ogv
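
To illustrate the separation being asked for, here is a minimal Python sketch in which only the file name is reduced to ASCII while the description text is passed through unchanged. The helper names `ascii_filename` and `describe` are hypothetical and not taken from the importer's code, and the title in the usage lines is invented for the demonstration.

```python
import unicodedata


def ascii_filename(title):
    """Hypothetical helper: build an ASCII-only file name stem from a title."""
    # Decompose accented letters so the base letter survives (e.g. "é" -> "e").
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop every remaining non-ASCII character, then join words with hyphens.
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "-".join(ascii_only.split())


def describe(title, authors):
    """Hypothetical helper: description text that keeps the original characters."""
    return "{} ({})".format(title, ", ".join(authors))


# Invented example data, for illustration only.
title = "Café au lait spots"
print(ascii_filename(title))               # file name: ASCII only
print(describe(title, ["José Núñez"]))     # description: characters preserved
```

Characters with no ASCII decomposition, such as Greek letters, are dropped entirely by this approach, which would leave a file name starting with a bare hyphen if the title began with such a letter.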

erlehmann commented 10 years ago

None of the names in the <contrib-group> element of this article seems to contain non-ASCII characters. Closing bug for that reason.

      <contrib-group>
        <contrib contrib-type="author">
          <name>
            <surname>Alemayehu</surname>
            <given-names>Mistre</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">
            <sup>1</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Dragan</surname>
            <given-names>Magdalena</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">
            <sup>1</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Pape</surname>
            <given-names>Cynthia</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">
            <sup>1</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Siddiqui</surname>
            <given-names>Iram</given-names>
          </name>
          <xref ref-type="aff" rid="aff2">
            <sup>2</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Sacks</surname>
            <given-names>David B.</given-names>
          </name>
          <xref ref-type="aff" rid="aff6">
            <sup>6</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Di Guglielmo</surname>
            <given-names>Gianni M.</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">
            <sup>1</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Babwah</surname>
            <given-names>Andy V.</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">
            <sup>1</sup>
          </xref>
          <xref ref-type="aff" rid="aff3">
            <sup>3</sup>
          </xref>
          <xref ref-type="aff" rid="aff4">
            <sup>4</sup>
          </xref>
          <xref ref-type="aff" rid="aff5">
            <sup>5</sup>
          </xref>
        </contrib>
        <contrib contrib-type="author">
          <name>
            <surname>Bhattacharya</surname>
            <given-names>Moshmi</given-names>
          </name>
          <xref ref-type="aff" rid="aff1">
            <sup>1</sup>
          </xref>
          <xref ref-type="corresp" rid="cor1">
            <sup>*</sup>
          </xref>
        </contrib>
      </contrib-group>
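
The check described above can also be done mechanically. The following is a minimal Python sketch that parses a `<contrib-group>` element and lists any author names containing non-ASCII characters. The function name `non_ascii_names` is hypothetical, and the sample is a shortened copy of the excerpt above (two authors, `<xref>` elements omitted). It covers author names only, so article titles would need a separate check.

```python
import xml.etree.ElementTree as ET


def non_ascii_names(xml_text):
    """Hypothetical helper: return author names that contain non-ASCII characters."""
    root = ET.fromstring(xml_text)
    hits = []
    for name in root.iter("name"):
        surname = name.findtext("surname", default="")
        given = name.findtext("given-names", default="")
        full = "{} {}".format(given, surname).strip()
        # str.isascii() is True only if every character is in the ASCII range.
        if not full.isascii():
            hits.append(full)
    return hits


# Shortened copy of the excerpt above.
SAMPLE = """
<contrib-group>
  <contrib contrib-type="author">
    <name>
      <surname>Alemayehu</surname>
      <given-names>Mistre</given-names>
    </name>
  </contrib>
  <contrib contrib-type="author">
    <name>
      <surname>Di Guglielmo</surname>
      <given-names>Gianni M.</given-names>
    </name>
  </contrib>
</contrib-group>
"""

print(non_ascii_names(SAMPLE))  # → []
```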