scieloorg / document-store-migracao

Document Store (Kernel) - Migration
BSD 2-Clause "Simplified" License

[ds-migracao][conversao] Adjustments to email conversion #75

Open robertatakenaka opened 5 years ago

robertatakenaka commented 5 years ago

Convert:

  1. From:
    <a href="mailto:email@domain.com">email@domain.com</a>

    to:

    <email>email@domain.com</email>

  2. From:
    <a href="mailto:email@domain.com"><img src="email.gif"/></a>

    to:

    <graphic xlink:href="email.gif"><email>email@domain.com</email></graphic>

    Real example: http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0100-879X2011000100001&lng=en&nrm=iso&tlng=en


  3. From:
    <a href="mailto:nuesslin@lrz.tum.de">Enviar e-mail para autor</a>

    to:

    <email xlink:href="mailto:nuesslin@lrz.tum.de">
    Enviar e-mail para autor</email>

    or

    <ext-link ext-link-type="email" xlink:href="mailto:nuesslin@lrz.tum.de">
    Enviar e-mail para autor</ext-link>

  4. From:
    <a href="mailto:nuesslin@lrz.tum.de">Enviar e-mail para nuesslin@lrz.tum.de</a>

    to:

    Enviar e-mail para <email>nuesslin@lrz.tum.de</email>

    or

    <ext-link ext-link-type="email" xlink:href="mailto:nuesslin@lrz.tum.de">
    Enviar e-mail para nuesslin@lrz.tum.de</ext-link>
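
The conversion rules listed above can be sketched roughly as follows. This is only an illustration using the standard library's `xml.etree.ElementTree`, not the code in document-store-migracao; the names `convert_mailto` and `XLINK` are hypothetical. It assumes the `<a>` element is well-formed XML with lowercase tag and attribute names, and for the last two cases it picks the `<ext-link>` alternative, since that one maps element to element.

```python
import xml.etree.ElementTree as ET

# XLink namespace used by JATS for xlink:href (constant name is hypothetical).
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("xlink", XLINK)


def convert_mailto(a):
    """Return a JATS element for an HTML <a href="mailto:..."> element.

    Hypothetical helper: a sketch of the rules in this issue, not project code.
    """
    href = a.get("href", "")
    if not href.startswith("mailto:"):
        raise ValueError("not a mailto link")
    address = href[len("mailto:"):]
    text = (a.text or "").strip()
    img = a.find("img")

    if img is not None:
        # Image link: <graphic> pointing at the image, wrapping <email>.
        new = ET.Element("graphic", {f"{{{XLINK}}}href": img.get("src", "")})
        ET.SubElement(new, "email").text = address
    elif text == address:
        # Link text is the address itself: plain <email>.
        new = ET.Element("email")
        new.text = address
    else:
        # Any other link text: <ext-link ext-link-type="email">.
        new = ET.Element(
            "ext-link",
            {"ext-link-type": "email", f"{{{XLINK}}}href": href},
        )
        new.text = text
    # Keep whatever text followed the original <a> element.
    new.tail = a.tail
    return new


if __name__ == "__main__":
    samples = [
        '<a href="mailto:email@domain.com">email@domain.com</a>',
        '<a href="mailto:email@domain.com"><img src="email.gif"/></a>',
        '<a href="mailto:nuesslin@lrz.tum.de">Enviar e-mail para autor</a>',
    ]
    for sample in samples:
        converted = convert_mailto(ET.fromstring(sample))
        print(ET.tostring(converted, encoding="unicode"))
```

Producing the `Enviar e-mail para <email>...</email>` form instead would mean splitting the link text and moving part of it into the parent element, which is why this sketch leaves that alternative out.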

References:

EMAIL https://jats.nlm.nih.gov/publishing/tag-library/1.2/element/email.html

Attributes

content-type: Type of Content
id: Document Internal Identifier
specific-use: Specific Use
xlink:actuate: Actuating the Link
xlink:href: Href (Linking Mechanism)
xlink:role: Role of the Link
xlink:show: Showing the Link
xlink:title: Title of the Link
xlink:type: Type of Link
xml:base: Base
xml:lang: Language
xmlns:xlink: XLink Namespace Declaration

Content Model

<!ELEMENT  email        (#PCDATA %email-elements;)*                  >

Expanded Content Model

(#PCDATA)*

GRAPHIC https://jats.nlm.nih.gov/publishing/tag-library/1.2/element/graphic.html

Attributes

content-type: Type of Content
id: Document Internal Identifier
mime-subtype: Mime Subtype
mimetype: Mime Type
orientation: Orientation
position: Position
specific-use: Specific Use
xlink:actuate: Actuating the Link
xlink:href: Href (Linking Mechanism)
xlink:role: Role of the Link
xlink:show: Showing the Link
xlink:title: Title of the Link
xlink:type: Type of Link
xml:base: Base
xml:lang: Language
xmlns:xlink: XLink Namespace Declaration

Content Model

<!ELEMENT  graphic      %graphic-model;                              >

Expanded Content Model

(alt-text | long-desc | abstract | email | ext-link | uri | caption | object-id | kwd-group | label | attrib | permissions)*

EXT-LINK https://jats.nlm.nih.gov/publishing/tag-library/1.2/element/ext-link.html

Attributes

assigning-authority: Authority Responsible for an Identifier
ext-link-type: Type of External Link
id: Document Internal Identifier
specific-use: Specific Use
xlink:actuate: Actuating the Link
xlink:href: Href (Linking Mechanism)
xlink:role: Role of the Link
xlink:show: Showing the Link
xlink:title: Title of the Link
xlink:type: Type of Link
xml:base: Base
xml:lang: Language
xmlns:xlink: XLink Namespace Declaration

Content Model

<!ELEMENT  ext-link     (#PCDATA %ext-link-elements;)*               >

Expanded Content Model

(#PCDATA | bold | fixed-case | italic | monospace | overline | roman | sans-serif | sc | strike | underline | ruby | named-content | styled-content | sub | sup)*
gustavofonseca commented 4 years ago

A new case [http://articlemeta.scielo.org/api/v1/article/?collection=scl&code=S0101-31222006000300005&format=xmlrsps]:

From:

<email>&lt;A HREF="mailto:lbfranke@vortex.ufrgs.br"&gt;lbfranke@vortex.ufrgs.br&lt;/A&gt;</email>

To:

<email>lbfranke@vortex.ufrgs.br</email>
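
A minimal sketch of how the case above could be cleaned, assuming the text of `<email>` holds a single HTML mailto anchor, either still escaped or already unescaped by the XML parser. The names `MAILTO_ANCHOR` and `clean_email_text` are hypothetical, and this is not necessarily how the fix was actually implemented.

```python
import html
import re

# Matches a whole <a href="mailto:...">...</a> anchor, in any letter case.
MAILTO_ANCHOR = re.compile(
    r'<a\s+href="mailto:([^"]+)"\s*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def clean_email_text(text):
    """Return the bare address when `text` is an HTML mailto anchor.

    Hypothetical helper: text that is not a single anchor is returned unchanged.
    """
    candidate = html.unescape(text).strip()
    match = MAILTO_ANCHOR.fullmatch(candidate)
    if match is None:
        return text
    # Use the address from the href attribute rather than the link text.
    return match.group(1).strip()


if __name__ == "__main__":
    raw = (
        '&lt;A HREF="mailto:lbfranke@vortex.ufrgs.br"&gt;'
        "lbfranke@vortex.ufrgs.br&lt;/A&gt;"
    )
    print(clean_email_text(raw))
```

Taking the address from `href` instead of the link text is a design choice made for this sketch; in the example both values are the same.
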
gustavofonseca commented 4 years ago

I'm not sure whether the ticket should go back into the backlog after being reopened. What would you say, @patymori?

joffilyfe commented 4 years ago

A new case [http://articlemeta.scielo.org/api/v1/article/?collection=scl&code=S0101-31222006000300005&format=xmlrsps]:

From:

<email>&lt;A HREF="mailto:lbfranke@vortex.ufrgs.br"&gt;lbfranke@vortex.ufrgs.br&lt;/A&gt;</email>

To:

<email>lbfranke@vortex.ufrgs.br</email>

This problem was referenced by the issue and fixed in PR https://github.com/scieloorg/articles_meta/pull/200.


For the real example cited in [1], the conversion runs, but validating the result reports errors. The log follows:

2020-01-14 15:02:56 ERROR [documentstore_migracao.processing.validation] Element inline-graphic is not declared in xref list of possible children - 12
2020-01-14 15:02:56 ERROR [documentstore_migracao.processing.validation] Element inline-graphic is not declared in ext-link list of possible children - 9
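
To illustrate why the validator complains, here is a rough sketch that checks child elements against the expanded content models quoted in the references of this issue. It is not a DTD validator and not the project's validation code; `ALLOWED_CHILDREN` and `undeclared_children` are hypothetical names, and the XML fragment in the usage part is made up for the example. The content model of `xref` is not quoted in this thread, so it is left out.

```python
import xml.etree.ElementTree as ET

# Expanded content models as quoted in this issue (JATS 1.2 tag library).
ALLOWED_CHILDREN = {
    "email": set(),
    "ext-link": {
        "bold", "fixed-case", "italic", "monospace", "overline", "roman",
        "sans-serif", "sc", "strike", "underline", "ruby", "named-content",
        "styled-content", "sub", "sup",
    },
    "graphic": {
        "alt-text", "long-desc", "abstract", "email", "ext-link", "uri",
        "caption", "object-id", "kwd-group", "label", "attrib", "permissions",
    },
}


def undeclared_children(root):
    """List (parent, child) tag pairs that the models above do not allow."""
    problems = []
    for parent in root.iter():
        allowed = ALLOWED_CHILDREN.get(parent.tag)
        if allowed is None:
            # No model recorded for this element, so it is not checked.
            continue
        for child in parent:
            if child.tag not in allowed:
                problems.append((parent.tag, child.tag))
    return problems


if __name__ == "__main__":
    # Hypothetical fragment: an image nested inside an email link.
    fragment = '<p><ext-link ext-link-type="email"><inline-graphic/></ext-link></p>'
    print(undeclared_children(ET.fromstring(fragment)))
```

By the same table, `<email>` nested inside `<graphic>` is allowed, while `<inline-graphic>` inside `<ext-link>` is not.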

ALTHOUGH the XML is not valid from the DTD's perspective, the result of the packtools transformation is good enough, see:

[Image: Screen Shot 2020-01-14 at 15 21 37]

[1] Real example: http://www.scielo.br/scielo.php?script=sci_arttext&pid=S0100-879X2011000100001&lng=en&nrm=iso&tlng=en