sylikc / pyexiftool

PyExifTool (active PyPI project) - A Python library to communicate with an instance of Phil Harvey's ExifTool command-line application. Runs one process with special -stay_open flag, and pipes data to/from. Much more efficient than running a subprocess for each command!
Other
148 stars 19 forks source link

V0.5.x py3 refactor #13

Closed sylikc closed 2 years ago

sylikc commented 3 years ago

Alright, this is still being developed, but I figured I'd get discussion on the code via a PR to myself. Therefore people can review code and make comments directly to code.

The base ExifTool class is pretty much complete. I'm planning to add logging functionality to help users debug what's going raw in/out of exiftool process, but that's about it.

sylikc commented 3 years ago

Looks great to me! I haven't had time to test it with my code to see if it breaks any of my use cases, but I don't see any obvious issues at the moment. I think the break apart into the low and high level classes will work well for maintainability. I left a few comments, all of which I admit are pretty nitpicky.

@csparker247 Thanks for reviewing it. I appreciate the comments and different ways to look at the usabilitiy.

I haven't had time to address the Helper class yet. There's lots of things to fix that up with. I brought up a new version of the github pages, but the documentation is wayyyyy out-of-date.

Anyhow, I think that base ExifTool class is pretty much complete. There's some small things here and there, but the interface shouldn't change much. (I'm currently having a windows process-related problem which I'm trying to determine if it's a reportable bug to exiftool or just a limitation of the way the windows exe is packaged).

sylikc commented 3 years ago

I have always (lazily) taken care of my responsibility for deleting the temporary directory by eventually rebooting the machine.

That approach allows me to essentially do what you do, too, and look at the data if something goes wrong. Using tempfile feels safer because a new test run will not inadvertently use temporary data from a previous run, which could mess up your tests when you skip removing (unlinking) our TMP_DIR.

Maybe I am missing something, though, and I am intrigued by how β€œmkdtemp complicates the debugging workflow when the tests fail”.

I am already looking forward to your response once I push tests involving the default cache. πŸ˜†

I'm back to developing ExifToolHelper as I realize the personal project that I have used ExifTool before the Helper came about. I'm looking forward to more tests and getting back on this. @jangop

(I found the method for Python creating and deleting its own temporary directories)

I am reconsidering using mkdtemp(), but when I say it complicates the debugging workflow, I mean that... when the tests fail, I would like to look at the output (which is in the tmp_dir) and see exactly why they failed... aka use exiftool to analyze those files. If I use the default Python stuff, and it just deletes the output, I would have to take extra steps to generate the same output to debug myself.

sylikc commented 3 years ago

@jangop what's the best way to support multiple incompatible versions in PyPI and on a project like this?

I think, even though the py3 refactor branch isn't quite mature yet, and there's a lot of borkiness with it, I want to push it forward rather than later. I certainly use it actively. I found the execute_json() behavior is kinda broken when you use it to update tags. Exiftool doesn't return anything and so execute_json() will always return None, or such.

Like wrapping exiftool isn't trivial. I find my own implementation of scripts nowadays extends ExifToolHelper() which can call the execute() and execute_json() when necessary.

Now that I'm a bit better with Python, I might disable a lot of the poorly written code that I merged in the original PR 11 https://github.com/smarnach/pyexiftool/pull/11 (which I kinda just moved all verbatim into ExifToolHelper) since the code is... well, it's super fragile, and I certainly don't use it, and it's hard to tell if anyone else is using it.

I certainly am still weak with the tests and linting and stuff... typing things is ugh so hard. But I also don't want to hold back the ExifToolHelper() work just because the rest of it has to catch up

Ideas for moving forward

Say, if I created a v0.4 branch, which will be bugfixes only against the v0.4 stuff, and then merge this branch into master. Then just having it starting with v0.5.x in PyPI? Just direct people to install that version if they want the latest smarnach original with bugfixes = v0.4.x? and then Python3 only = v0.5.x?

jangop commented 3 years ago

If you want to support diverging branches, I don't know. Probably different projects. 🀷

But I think you are referring to something like a pre-release, and that is certainly doable. It would then be installed via pip install --pre pyexiftool. See https://pip.pypa.io/en/stable/cli/pip_install/#install-pre, https://pip.pypa.io/en/stable/cli/pip_install/#pre-release-versions, and https://www.python.org/dev/peps/pep-0440/#handling-of-pre-releases; also https://semver.org/.

As I've stated before somewhere, I see no point in supporting Python 2. There will always exist out-of-date versions of pyexiftool for that out-of-date version of Python. Because this refactor is geared towards future, it should require Python 3.9, if only because that makes type hinting so much more convenient.

Regarding testing and linting, I would not overdo it. This is a small project. Automate the tests, run black and have it check commits automatically; done. That should make life easier, not harder. ☺️ I'll try to provide a script.

csparker247 commented 3 years ago

Maybe I'm confusing the issue, but can't you just push both 0.4.x and 0.5.x packages to PyPI at any time and let users decide in their requirements which one to download? PyPI doesn't have a monotonically increasing version requirement, does it?

Edit: Reading through again, maybe this is what you're proposing? I think this is a reasonable use of a versioning scheme that's not that confusing.

sylikc commented 3 years ago

Thanks for the support you two!

Yeah, that's the plan. I'm planning to ...

  1. Clean up the py3 v0.5.x branch a bit further (like the TMP_DIR stuff from above)
  2. Probably moving functionality that is "not thoroughly tested" like the PR 11 stuff (I merged that when I was a bit novice to Python) to a class extending ExifToolHelper. (This change could allow someone who depended on that functionality to use that subclass, while I build out this hierarchy of increasingly extended functionality)
  3. "Try" to update the documentation a bit regarding the ExifTool base class and whatever functionality I end up leaving in ExifToolHelper
  4. Create a branch from master, probably called v0.4-legacy or something of the sort. That will be a bugfix only branch in case someone doesn't want to convert to the new structure and wants some bugfix stuff in the old code. I don't expect it to happen, but who knows.
  5. Place this PR in Ready for Review, and with approval merge this into master.
  6. Push the v0.5.x version into PyPI
  7. Probably update the README to indicate there's two versions to pull in case someone is so inclined. But having moved my own projects from the old to the new, I've found the functionality in the ExifToolHelper indispensable, so I couldn't imagine new users wanting to use the old version

While I'm at it, do you like the direction? Is this something that is easy to work into your own projects? Would it break a lot of your existing code or help it?

csparker247 commented 3 years ago

Yeah, this is fine for me. I'll pretty much endlessly target the highest versioned PyPI release, and change my requirements file if you introduce some breaking change that I don't yet have time to support.

I would suggest just calling your bugfix branch v0.4.x. I'm not sure legacy adds anything to the name. And save yourself the headache and only patch that branch when people have issues and can't upgrade to 0.5+ (i.e. don't try to cherry pick patches from the main branch). I concur with @jangop on that point.

sylikc commented 3 years ago

@csparker247 FYI ...

So I changed the function signature of get_tags() ... I'm going to make all future functions in the Helper class be files first parameter instead of having it somewhere near the end. I think it makes more logical sense, and matches how other libraries handle stuff like this.

files, , params=None

That's how the function signature will be when I pull in the other "experimental" functionality into the Helper (set_tags(), keywords, etc). I probably will merge this before that finally comes in.

I still have endless tests to write ....................

jangop commented 3 years ago

I'll try to provide a script.

I have provided a respective pull request by now. Please take a look at it.

jangop commented 3 years ago

Some of the tests are currently failing, right?

sylikc commented 3 years ago

@jangop I have a question for you regarding encodings and such, I don't know how to message on GitHub ... but I'm going to ask it on https://github.com/sylikc/pyexiftool/issues/29 as it's related to that.

sylikc commented 2 years ago

@jangop I'll probably merge this super big version change when I can get all the tests to pass... and actually I don't know why it's failing lol

They pass when I run it on my dev environment 😨

sylikc commented 2 years ago

oh. It must be the exiftool version ... (referring comment https://github.com/sylikc/pyexiftool/pull/32#issuecomment-1026076152 )

I need a newer version than it is building with ... how do I do that with GitHub actions? On my linux box, I built with source

jangop commented 2 years ago

In principle, Github's runner should also be able to build from source, just like you do on your β€œlinux box”.

sylikc commented 2 years ago

πŸ€— I got the tests to pass! Do you think this is ready for prime time? @jangop @csparker247

sylikc commented 2 years ago
jangop commented 2 years ago

hugs I got the tests to pass! Do you think this is ready for prime time? @jangop @csparker247

Awesome, great work. πŸ‘

We should make sure the additional 3 tests that are currently expected to fail no longer fail. And see that the documentation makes sense.

sylikc commented 2 years ago

for sure the documentation doesn't make sense... lol and I see residual Python 2 code I'll be removing later

I've been updating "some" of the documentation in the sense of files like changelog and readme and compatibility, but for sure even some of the comments i had to update which aren't really up to the features that were changed or added

csparker247 commented 2 years ago

πŸ‘ πŸŽ‰ Seems great. You've poured lots of time into this! Nice job!

sylikc commented 2 years ago

Alright the time is upon us... for me to break everyone's existing code on upgrade! πŸ˜πŸ˜‚πŸ˜Ž

I'm working on documentation update... and maybe even a GitHub Action to publish it automatically (when it's actually acceptable quality). I'm going to push a few more documentation updates to this branch and it will merge in a little bit

sylikc commented 2 years ago

Sheesh, that just doesn't look like 128 commits and a year of updates... but it's merged 😎 release coming shortly

I appreciate both of you who have helped me along this journey! Can't thank you enough to help drive some good design decisions during the refactor!

$ git merge --no-ff v0.5.x-py3-refactor

Merge made by the 'ort' strategy.
 .github/workflows/lint-and-test.yml |   23 +-
 .gitignore                          |    9 +
 CHANGELOG.md                        |   14 +-
 COMPATIBILITY.txt                   |   55 ++
 LICENSE                             |    2 +-
 README.rst                          |  195 +++++-
 doc/index.rst                       |    8 -
 {doc => docs}/.gitignore            |    0
 {doc => docs}/Makefile              |    1 +
 docs/make.bat                       |   35 +
 {doc => docs/source}/conf.py        |   59 +-
 docs/source/index.rst               |   39 ++
 exiftool/__init__.py                |   18 +-
 exiftool/constants.py               |   80 +++
 exiftool/exceptions.py              |   51 ++
 exiftool/exiftool.py                | 1283 +++++++++++++++++++++--------------
 exiftool/experimental.py            |  355 ++++++++++
 exiftool/helper.py                  |  231 +++++++
 mypy.ini                            |    2 +
 scripts/README.txt                  |    7 +
 scripts/flake8.bat                  |   21 +
 scripts/flake8.ini                  |    9 +
 scripts/flake8_requirements.txt     |    6 +
 scripts/mypy.bat                    |   16 +
 scripts/mypy_reqiuirements.txt      |    1 +
 scripts/pytest.bat                  |   19 +
 scripts/pytest_requirements.txt     |    4 +
 scripts/unittest.bat                |   12 +
 scripts/windows.coveragerc          |    9 +
 setup.cfg                           |    3 +
 setup.py                            |   45 +-
 tests/test_alpha.py                 |  241 +++++++
 tests/test_exiftool.py              |  434 +++++++-----
 tests/test_helper.py                |  154 +++++
 34 files changed, 2671 insertions(+), 770 deletions(-)
 create mode 100644 COMPATIBILITY.txt
 delete mode 100644 doc/index.rst
 rename {doc => docs}/.gitignore (100%)
 rename {doc => docs}/Makefile (99%)
 create mode 100644 docs/make.bat
 rename {doc => docs/source}/conf.py (88%)
 create mode 100644 docs/source/index.rst
 create mode 100644 exiftool/constants.py
 create mode 100644 exiftool/exceptions.py
 create mode 100644 exiftool/experimental.py
 create mode 100644 exiftool/helper.py
 create mode 100644 mypy.ini
 create mode 100644 scripts/README.txt
 create mode 100644 scripts/flake8.bat
 create mode 100644 scripts/flake8.ini
 create mode 100644 scripts/flake8_requirements.txt
 create mode 100644 scripts/mypy.bat
 create mode 100644 scripts/mypy_reqiuirements.txt
 create mode 100644 scripts/pytest.bat
 create mode 100644 scripts/pytest_requirements.txt
 create mode 100644 scripts/unittest.bat
 create mode 100644 scripts/windows.coveragerc
 create mode 100644 tests/test_alpha.py
 create mode 100644 tests/test_helper.py
sylikc commented 2 years ago

have to restore the branch as it's referred to in https://github.com/sylikc/pyexiftool/pull/39