papis / papis-zotero

Zotero compatibility layer for papis
GNU General Public License v3.0

Still problems with date (single year) and import zotero.sqlite #33

Closed m040601 closed 8 months ago

m040601 commented 1 year ago

$ python --version

Python 3.11.5

I'm on Archlinux

Linux  6.4.12-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 24 Aug 2023 00:38:14 +0000 x86_64 GNU/Linux

Papis installed from https://aur.archlinux.org/packages/papis

Name            : papis
Version         : 0.13-2
...
Depends On      : python  python-pyaml  python-arxiv2bib  python-beautifulsoup4
                  python-bibtexparser  python-certifi  python-chardet  python-click
                  python-colorama  python-dominate  python-filetype  python-habanero
                  python-isbnlib  python-lxml  python-prompt_toolkit  python-pygments
                  python-pyparsing  python-doi  python-slugify  python-requests  python-stevedore
                  python-tqdm  python-typing_extensions
Optional Deps   : papis-rofi: integration with rofi
                  python-whoosh
Required By     : papis-zotero
Installed Size  : 2.61 MiB
Build Date      : Sun 03 Sep 2023 02:57:29 AM WEST
Install Date    : Sun 03 Sep 2023 03:09:43 AM WEST
...

Sorry to open a new issue for this.

I've read and understood, https://github.com/papis/papis-zotero/issues/30 and https://github.com/papis/papis-zotero/pull/31 but it was closed and I'm having exactly the same problem,

I have plenty of records, where the date is just a year,

Yeap, just like me. Actually, the huge majority of my items only have a year as the date. They are very different kinds of records from a mix of different providers, so it can't be that there is something wrong with the formatting, extraneous characters or file corruption. They were all added and display perfectly well in Zotero.

I open Zotero. I have 13 items for testing on Zotero. Check everything displays nicely. No errors, no messages. I close Zotero.

Then,

papis zotero import -s ~/Zotero/

First 2 Items, Item 0 and 1 went OK

[INFO] papis_zotero.sql: [   0/13  ] Exporting item 'GURTUKXA' with ref 'IberianPeninsu2023' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/GURTUKXA'.
[INFO] papis_zotero.sql: [   1/13  ] Exporting item '8VPW3J5A' with ref 'Madrid2023' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/8VPW3J5A'.

those 2 were Wikipedia pages, and the date is displayed in Zotero for item "0" as,

It ends up in the GURTUKXA folder in the papis library folder. The info.yaml contains,

year: 2023

There is no "date" field in that info.yaml file.

All the others fail. They all have a date that is just a year. There are two log entries for each item: first an [ERROR] log entry, then an [INFO] log entry.

[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2019-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [   2/13  ] Exporting item 'CEKTKX3E' with ref 'MirandeseCryptTemkin' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/CEKTKX3E'.

That item "2" was displayed in Zotero,

This will end up in a folder "CEKTKX3E" in my papis folder library. Inside there will be an info.yaml file with this field,

date: 2019-00-00 2019

There is no "year" field.

continuing, the same error thing will happen for the other items,

[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2011-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [   3/13  ] Exporting item 'ZVRWB8AE' with ref 'LaUrcaDeCarvTemkin' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/ZVRWB8AE'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2009-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [   4/13  ] Exporting item 'INP3HXMY' with ref 'TheDownfallOfTemkin' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/INP3HXMY'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2009-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [   5/13  ] Exporting item 'SNWT3ZRP' with ref 'GasparCastanoTemkin' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/SNWT3ZRP'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2007-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [   6/13  ] Exporting item 'INPE5E5T' with ref 'TheCryptoJewiTemkin' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/INPE5E5T'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2007-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [   7/13  ] Exporting item 'FGK5XDHF' with ref 'LaCapitulacionTemkin' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/FGK5XDHF'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2006-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [   8/13  ] Exporting item 'IE98QMBY' with ref 'LosMeritosYSTemkin' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/IE98QMBY'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2005-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [   9/13  ] Exporting item 'LSR26UPR' with ref 'ElDescubrimienTemkin' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/LSR26UPR'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2011-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [  10/13  ] Exporting item 'WIWW7LCU' with ref 'ArchivesEtHisNougar' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/WIWW7LCU'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2016-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [  11/13  ] Exporting item '4WMPYVGU' with ref 'RunningCrabsCarval' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/4WMPYVGU'.
[ERROR] papis_zotero.sql: Failed to parse date. (Caught exception 'ValueError: time data '2010-00' does not match format...'. Use `--log DEBUG` to see traceback)
[INFO] papis_zotero.sql: [  12/13  ] Exporting item 'VYQGAYPR' with ref 'EchinoideaFromPereir' to folder '/home/a1/Testing/2023/ago23/papis-ago23/lib-1/VYQGAYPR'.

Yeap. Pretty much the same error.

[INFO] papis_zotero.sql: Finished exporting from '/home/a1/Zotero/'.
[INFO] papis_zotero.sql: Exported files can be found at '/home/a1/Testing/2023/ago23/papis-ago23/lib-1'.

Let's turn on the --log DEBUG option.

papis --log DEBUG zotero import -s ~/Zotero/

It's always the same error message, "line 63, 568, 349 this and that" ...

For each item two log entries, first an [ERROR] entry then a [DEBUG] entry,

[ERROR] papis_zotero.sql: Failed to parse date.
  ┆ Traceback (most recent call last):
  ┆   File "/usr/lib/python3.11/site-packages/papis_zotero/sql.py", line 63, in get_fields
  ┆     d = datetime.strptime(date.split(" ")[0][:-3], "%Y-%m")
  ┆         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ┆   File "/usr/lib/python3.11/_strptime.py", line 568, in _strptime_datetime
  ┆     tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  ┆                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  ┆   File "/usr/lib/python3.11/_strptime.py", line 349, in _strptime
  ┆     raise ValueError("time data %r does not match format %r" %
  ┆ ValueError: time data '2019-00' does not match format '%Y-%m'
[DEBUG] bibtex: Generated ref 'Mirandese crypt Temkin '.
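
The traceback shows `datetime.strptime` rejecting `'2019-00'`: the `%m` directive does not accept month `00`, which appears to be how a year-only date is stored (compare the `2019-00-00 2019` value above). A minimal sketch of a more tolerant approach, assuming the stored value always starts with a `YYYY-MM-DD` part. Both `strptime_accepts` and `parse_zotero_date` are hypothetical helpers for illustration, not the fix that landed in papis-zotero:

```python
from datetime import datetime
from typing import Dict


def strptime_accepts(text: str, fmt: str = "%Y-%m") -> bool:
    """Return True if datetime.strptime can parse `text` with `fmt`."""
    try:
        datetime.strptime(text, fmt)
    except ValueError:
        return False
    return True


def parse_zotero_date(raw: str) -> Dict[str, int]:
    """Hypothetical helper: extract the known date parts from a value
    such as '2019-00-00 2019', treating '00' as "unknown"."""
    iso_part = raw.split(" ")[0]          # e.g. '2019-00-00'
    result: Dict[str, int] = {}
    for name, part in zip(("year", "month", "day"), iso_part.split("-")):
        if not part.isdigit() or int(part) == 0:
            break                         # stop at the first unknown part
        result[name] = int(part)
    return result
```

Under these assumptions, `parse_zotero_date("2019-00-00 2019")` yields only a year, so an importer could always write a `year` field and add `month` and `day` only when they are known.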

So they do end up physically as files and folders in my papis library folder, except that they don't show up in papis.

There are also some things that are maybe "obvious" or "easy" for the developer who works with the code every day, but that I find unintuitive and confusing as an end user.

This is not intended as a negative comment or criticism. I very much appreciate all the work and effort being put into this tool. It might be a personal nitpick ... but here is my feedback as an end user,

  1. Starting at zero instead of one.
[INFO] papis_zotero.sql: [   0/13  ]
...
...
[INFO] papis_zotero.sql: [   12/13  ]

Why start at "0" and not at "1"? The first item is "0" and the last one is "12", instead of the first being "1" and the last being "13". I know I am importing 13 items and I want to follow the procedure. This just makes the debug log so uncomfortable to read.
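
A small sketch of what one-based counting could look like, assuming the importer loops over the item keys in Python. `progress_lines` is a hypothetical function and its message format only approximates the log lines above:

```python
from typing import List, Sequence


def progress_lines(keys: Sequence[str]) -> List[str]:
    """Build one progress message per item, counting from 1 to len(keys)."""
    total = len(keys)
    width = len(str(total))  # pad the counter to the width of the total
    return [
        f"[{index:>{width}}/{total}] Exporting item '{key}'"
        for index, key in enumerate(keys, start=1)
    ]
```

The only change from zero-based counting is `start=1` in `enumerate`, so the last message reads `13/13` rather than `12/13`.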

  2. First an [ERROR] log entry, then an [INFO] log entry.

When not using the --log DEBUG option, if there is an error, you get, for each item, first a "negative" [ERROR] log entry, then a "positive" [INFO] log entry.

This is also confusing and makes mentally parsing the log difficult. At the beginning I wasn't even sure what had happened. I asked myself: what does this mean? Did things go OK or not? Was it a "success" or not? Oh ..., there is the word "ERROR" there, so things must have stopped, and nothing must have happened. But it did. Even with the error the items were "kind of imported".

This is confusing.

We're talking about importing data here. It doesn't matter what the end result is: success, failure, or something in between.

The important thing is that the end user should be 100% sure of what happened, and 100% sure of the current state of the data on the system.

Assuming they have backups, they should be able to decide to roll back, delete everything and start over again.

I would expect something like this sequence of log entries

[INFO] Trying to export item XYZ (a nice  first neutral message)
[INFO] Successfully exported  item XYZ to folder foo/bar
[MAYBE] Some empty line here ? visually nice to separate
[INFO] Trying to export item ABC (a nice first neutral message)
[INFO] Successfully exported  item ABC to folder foo/bar

or in case something "bad" happens

[INFO] Trying to export item XYZ (a nice first neutral message)
[ERROR] Something bad happened .... bla bla bla ...
[INFO] Item XYZ exported to folder foo/bar with possible errors ... check blablabla ... some fields might be "wrong"
[MAYBE] Some empty line here ? visually nice to separate
[INFO] Trying to export item ABC (a nice first neutral message)
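
The sequence proposed above could be sketched with the standard `logging` module as follows. This is only an illustration of the suggested message order: `export_item`, the `convert` callback and the message texts are hypothetical and do not reflect how papis-zotero is actually structured:

```python
import logging
from typing import Callable

logger = logging.getLogger("import_sketch")


def export_item(key: str, folder: str, convert: Callable[[str], dict]) -> bool:
    """Log a neutral start message, then exactly one outcome message.

    Returns True for a clean export and False when the item was
    exported with errors.
    """
    logger.info("Trying to export item '%s'.", key)
    try:
        convert(key)  # stand-in for the real field conversion
    except ValueError as exc:
        logger.error("Failed to convert item '%s': %s", key, exc)
        logger.info(
            "Item '%s' exported to '%s' with possible errors; "
            "some fields might be wrong.", key, folder)
        return False
    logger.info("Successfully exported item '%s' to '%s'.", key, folder)
    return True
```

The return value lets a caller count clean and problematic items and print a summary at the end, so the user knows the state of the library after the run.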

  3. Importing already "imported" items.

I find it unintuitive that if you run the import again, and you already have those items as folders/files on your system, they are probably overwritten without any "WARNING" or "CHOICE" about this. I'm not sure what to recommend. Maybe something like

[INFO] Trying to export item XYZ (a nice  first neutral message)
[WARNING] Item XYZ already in the papis folder foo/bar. 
[INFO] Skipped trying to import item XYZ. Use --force to rewrite those folders/files
[INFO] Trying to export item XYZ (a nice  first neutral message)

and when using that "force",

[INFO] Trying to export item XYZ (a nice  first neutral message)
[WARNING] Item XYZ already in the papis folder foo/bar. 
[INFO] Force option used. folder/fileXYZ overwritten
[INFO] Successfully imported item XYZ.
[INFO] Trying to export item XYZ (a nice  first neutral message)
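
A minimal sketch of the skip-or-overwrite decision suggested above. Note that `--force` is only the option proposed in this comment, not an existing flag, and `plan_export` is a hypothetical helper:

```python
from pathlib import Path


def plan_export(folder: Path, force: bool = False) -> str:
    """Decide what to do with an item's target folder.

    Returns "write" for a new folder, "skip" for an existing folder
    when force is off, and "overwrite" for an existing folder when
    force is on.
    """
    if not folder.exists():
        return "write"
    return "overwrite" if force else "skip"
```

An importer could log a warning for "skip" and "overwrite" before acting, which would give the message sequences shown above.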

  4. "Year" versus "Date" in the info.yaml file.

Sometimes the imported item ends up with a "date" field, sometimes with a "year" field. I also find this confusing.

  5. Import and no "refresh"

Yeah, I'm afraid the SQLite importer just writes out the info.yaml files and doesn't let papis know it needs to update the database #32

I also find this totally not user friendly. At least a message on stdout or stderr should be produced.

Already reported in, https://github.com/papis/papis-zotero/issues/32

alexfikl commented 1 year ago

That issue (#30) was only fixed on main and is not yet released. Can you try out installing directly from git? If you're on Arch, it shouldn't be too hard to add a patch to the papis-zotero PKGBUILD or use pip (shouldn't install globally though!).

For the rest of your points: they all seem very reasonable and should be fixed in some way! Can you make separate issues for each one so they don't get lost? Just copy pasting the text from your 1 to 5 points would be great.

Panadestein commented 1 year ago

Hi, thanks for this very useful project! I want to leave here a nix flake I've made in case somebody using the pypi release is facing these sqlite import errors. I tested in nixpkgs-unstable and I am able to import my full Zotero library without errors.

{
  description = "Build papis-zotero";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;

      papis-zotero-custom = pkgs.python3Packages.buildPythonPackage rec {
        version = "0.1.2";
        pname = "papis-zotero";

        src = pkgs.fetchFromGitHub {
          owner = "papis";
          repo = "papis-zotero";
          rev = "20a50ebbcb115fdddcbc922b4535f5c6c9f0e7b0";
          sha256 = "0242mz5dvv2nj91lsc81779y8ad1xs698crgv654wws96ll8knn9";
        };

        doCheck = false;
        propagatedBuildInputs = [
          pkgs.papis
          pkgs.python3Packages.papis
        ];

        meta = {
          homepage = "https://github.com/papis/papis-zotero";
          description = "Zotero support for papis";
        };
      };

    in
      {
        devShell.x86_64-linux = pkgs.mkShell {
          buildInputs = [
            papis-zotero-custom
          ];
        };
      };
}

Then enter the environment with nix develop .#devShell.x86_64-linux. Cheers.

mbrunnen commented 1 year ago

Seems to be solved with 20a50ebbcb115fdddcbc922b4535f5c6c9f0e7b0

m040601 commented 1 year ago

Seems to be solved with 20a50eb

Nice. I haven't tested this myself yet, since I will be waiting for the release version.

I don't want to mess up my system with "pip" or patches.

PS: I also tried installing "papis-zotero" with "pipx".

Not "pip", "pipx". It is a tool I like very much, because it leaves your system clean and isolated from messing around.

But it caters more to "apps", not so much to "libs". So unfortunately it doesn't work with "papis-zotero".

$ pipx install papis-zotero

No apps associated with package papis-zotero. ...
 If you are attempting to install a library, pipx should not be
used. Consider using pip or a similar tool instead.

alexfikl commented 8 months ago

Thanks everyone for testing! I'll go ahead and close this, since the fix seems to work and it will be in the next release.