papis / papis-zotero

Zotero compatibility layer for papis
GNU General Public License v3.0

Problems with date #30

Closed mcepl closed 1 year ago

mcepl commented 1 year ago

I have plenty of records where the date is just a year, for example this one:

<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns:z="http://www.zotero.org/namespaces/export#"
 xmlns:dc="http://purl.org/dc/elements/1.1/"
 xmlns:vcard="http://nwalsh.com/rdf/vCard#"
 xmlns:foaf="http://xmlns.com/foaf/0.1/"
 xmlns:bib="http://purl.org/net/biblio#"
 xmlns:dcterms="http://purl.org/dc/terms/"
 xmlns:prism="http://prismstandard.org/namespaces/1.2/basic/">
    <bib:Book rdf:about="urn:isbn:978-80-7367-860-9">
        <z:itemType>book</z:itemType>
        <dc:publisher>
            <foaf:Organization>
                <vcard:adr>
                    <vcard:Address>
                       <vcard:locality>Praha</vcard:locality>
                    </vcard:Address>
                </vcard:adr>
                <foaf:name>Portál</foaf:name>
            </foaf:Organization>
        </dc:publisher>
        <bib:authors>
            <rdf:Seq>
                <rdf:li>
                    <foaf:Person>
                        <foaf:surname>Halík</foaf:surname>
                        <foaf:givenName>Tomáš</foaf:givenName>
                    </foaf:Person>
                </rdf:li>
            </rdf:Seq>
        </bib:authors>
        <bib:contributors>
            <rdf:Seq>
                <rdf:li>
                    <foaf:Person>
                        <foaf:surname>Dostatni</foaf:surname>
                        <foaf:givenName>Tomasz</foaf:givenName>
                    </foaf:Person>
                </rdf:li>
            </rdf:Seq>
        </bib:contributors>
        <dcterms:isReferencedBy rdf:resource="#item_368"/>
        <dcterms:isReferencedBy rdf:resource="#item_367"/>
        <dc:subject>20.-21. stol. \textbar2 czenas</dc:subject>
        <dc:subject>20.-21. stol. \textbar7 ch462155 \textbar2 czenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>20.-21. stol. |2 czenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>20.-21. stol. |7 ch462155 |2 czenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>20th-21st centuries \textbar2 eczenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>20th-21st centuries |2 eczenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>ateismus \textbar7 ph118651</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>ateismus |7 ph118651</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>atheism</dc:subject>
        <dc:subject>Bůh a člověk \textbar7 ph116953 \textbar2 czenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>Bůh a člověk |7 ph116953 |2 czenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>Catholic priests</dc:subject>
        <dc:subject>Czechia</dc:subject>
        <dc:subject>Česko</dc:subject>
        <dc:subject>Halík</dc:subject>
        <dc:subject>
           <z:AutomaticTag><rdf:value>Halík, Tomáš</rdf:value></z:AutomaticTag>
        </dc:subject>
        <dc:subject>intelektuálové \textbar7 ph121139</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>intelektuálové |7 ph121139</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>intellectuals</dc:subject>
        <dc:subject>interviews \textbar2 eczenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>interviews |2 eczenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>katoličtí kněží \textbar7 ph114899</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>katoličtí kněží |7 ph114899</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>náboženská víra \textbar7 ph115476</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>náboženská víra |7 ph115476</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>názory a postoje \textbar7 ph137634 \textbar2 czenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>názory a postoje |7 ph137634 |2 czenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>pluralism (religion)</dc:subject>
        <dc:subject>pluralism (society)</dc:subject>
        <dc:subject>pluralismus (náboženství) \textbar7 ph281886</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>pluralismus (náboženství) |7 ph281886</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>pluralismus (společnost) \textbar7 ph135655</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>pluralismus (společnost) |7 ph135655</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>postmodern society \textbar2 eczenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>postmodern society |2 eczenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>postmoderní společnost \textbar7 ph124410 \textbar2 czenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>postmoderní společnost |7 ph124410 |2 czenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>relations between God and man \textbar2 eczenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>relations between God and man |2 eczenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>religious faith</dc:subject>
        <dc:subject>rozhovory \textbar7 fd133303 \textbar2 czenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>rozhovory |7 fd133303 |2 czenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>secularization</dc:subject>
        <dc:subject>sekularizace \textbar7 ph125471</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>sekularizace |7 ph125471</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:subject>Tomáš</dc:subject>
        <dc:subject>views and attitudes \textbar2 eczenas</dc:subject>
        <dc:subject>
            <z:AutomaticTag>
               <rdf:value>views and attitudes |2 eczenas</rdf:value>
            </z:AutomaticTag>
        </dc:subject>
        <dc:title>Tomáš Halík: smířená různost: rozhovor</dc:title>
        <dc:date>2011</dc:date>
        <z:shortTitle>Tomáš Halík</z:shortTitle>
        <z:libraryCatalog>aleph.nkp.cz Library Catalog</z:libraryCatalog>
        <dc:subject>
           <dcterms:LCC><rdf:value>272-726.3 |2 MRF</rdf:value></dcterms:LCC>
        </dc:subject>
        <dc:description>00000</dc:description>
        <dc:identifier>ISBN 978-80-7367-860-9</dc:identifier>
        <prism:edition>Vyd. 1</prism:edition>
        <z:numPages>211</z:numPages>
    </bib:Book>
    <bib:Memo rdf:about="#item_368">
       <rdf:value>&lt;p&gt;halik:2011tomas&lt;/p&gt;</rdf:value>
    </bib:Memo>
    <bib:Memo rdf:about="#item_367">
       <rdf:value>&lt;p&gt;Obsahuje rejstřík&lt;/p&gt;</rdf:value>
    </bib:Memo>
</rdf:RDF>

When I try to import, I get this on stderr:

[INFO] papis_zotero.sql: [   0/199 ] Exporting item 'A9AW7P8Z' with ref 'DeclarationDoRatzin2000' to folder '/home/matej/Dokumenty/clanky/bibliography/A9AW7P8Z'.
[ERROR] papis_zotero.sql: Failed to parse date.
  ┆ Traceback (most recent call last):
  ┆   File "/home/matej/.local/lib/python3.11/site-packages/papis_zotero/sql.py", line 63, in get_fields
  ┆     d = datetime.strptime(date.split(" ")[0][:-3], "%Y-%m")
  ┆         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ┆   File "/usr/lib64/python3.11/_strptime.py", line 568, in _strptime_datetime
  ┆     tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  ┆                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ┆   File "/usr/lib64/python3.11/_strptime.py", line 349, in _strptime
  ┆     raise ValueError("time data %r does not match format %r" %
  ┆ ValueError: time data '2011-00' does not match format '%Y-%m'
[DEBUG] bibtex: Generated ref 'Tomáš Halík: sm Halík, '.

Couldn’t the module somehow recognize that a date matching the regular expression \d+ won’t match the strptime format %Y-%m?
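To illustrate what I mean, here is a rough sketch of a regex-based check. This is not the actual papis-zotero code: the helper name `parse_zotero_date` and the pattern `DATE_RE` are made up for the example, and I am assuming the stored value starts with a zero-padded `YYYY-MM-DD` part (something like `2011-00-00 2011`), which is what the `'2011-00'` in the traceback suggests.

```python
import re
from typing import Dict

# Hypothetical helper, not the actual papis-zotero implementation.
# Assumption: the stored value starts with "YYYY-MM-DD", where unknown
# parts are zero-padded (e.g. "2011-00-00 2011" for a year-only date).
DATE_RE = re.compile(
    r"^(?P<year>\d{4})(?:-(?P<month>\d{2}))?(?:-(?P<day>\d{2}))?"
)


def parse_zotero_date(date: str) -> Dict[str, int]:
    """Return the known date parts, skipping missing or zero components."""
    match = DATE_RE.match(date.strip())
    if match is None:
        # Not a recognizable date: report nothing instead of raising.
        return {}

    result = {}
    for key, value in match.groupdict().items():
        # "00" marks an unknown month or day, so it is left out.
        if value is not None and int(value) != 0:
            result[key] = int(value)
    return result


print(parse_zotero_date("2011-00-00 2011"))
print(parse_zotero_date("2000-08-06 August 6, 2000"))
```

With something like this, a year-only record would just yield the year, and the month and day would be filled in only when they are actually known, so strptime never sees `2011-00`.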

Using papis-zotero 0.1.2 from PyPI.

alexfikl commented 1 year ago

#31 takes a stab at this using a regex, as suggested. Can you give it a try to see if it works in your case?