TheLastGimbus / GooglePhotosTakeoutHelper

Script that organizes the Google Takeout archive into one big chronological folder
https://aur.archlinux.org/packages/gpth-bin
Apache License 2.0
3.88k stars, 191 forks

Make date extraction more modular #112

Closed tkuenzle closed 1 year ago

tkuenzle commented 2 years ago

First of all, thanks a lot @TheLastGimbus for coming up with this nice script, I have found it to be very useful!

Similar to #111, I noticed a few files for which the script cannot infer the correct date. On closer inspection, it looks like it would not be hard at all to add support for them.

When I took a look at the code however, I identified two problems:

In order to address these two issues and make the script much easier to extend in the future, I would like to propose the following change:

Move all the logic for date extraction into a separate date_extractors module which could look like this:

from datetime import datetime
from pathlib import Path


class DateExtractor:
    """Base class: return the file's date, or None if it cannot be determined."""

    def extract_date(self, file_path: Path) -> datetime | None:
        raise NotImplementedError


class ExifDateExtractor(DateExtractor):
    def extract_date(self, file_path: Path) -> datetime | None:
        # implementation for extracting the date from the EXIF data
        ...


class FileNameDateExtractor(DateExtractor):
    def extract_date(self, file_path: Path) -> datetime | None:
        # implementation for extracting the date from the file name
        ...

...
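To make the idea more concrete, here is a minimal sketch of what a file-name extractor could look like. The pattern is an assumption made for illustration (names such as "IMG_20190509_154733.jpg"), not something taken from the current script, and the class is written standalone so the snippet runs on its own.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Hypothetical pattern: 8 date digits, a separator, then 6 time digits,
# as in "IMG_20190509_154733.jpg". Not taken from the actual script.
_STAMP = re.compile(r"(\d{8})[_-](\d{6})")


class FileNameDateExtractor:
    """Illustrative extractor that reads a timestamp from the file name."""

    def extract_date(self, file_path: Path) -> Optional[datetime]:
        match = _STAMP.search(file_path.name)
        if match is None:
            return None
        try:
            return datetime.strptime("".join(match.groups()), "%Y%m%d%H%M%S")
        except ValueError:
            # The digits matched but do not form a valid date (e.g. month 13).
            return None
```

Returning None rather than raising keeps every extractor interchangeable: the caller only has to check for None before moving on to the next one.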

We could then define a list of all the extractors we would like to apply, in order of priority (and even make this configurable through a command-line argument). Getting the date would then be as simple as:

extractors = [ExifDateExtractor(), FileNameDateExtractor()]

# First non-None result in priority order; None if every extractor fails.
date = next(
    (
        extracted_date
        for extractor in extractors
        if (extracted_date := extractor.extract_date(file_path)) is not None
    ),
    None,
)

Such an implementation would nicely separate the extraction logic from priority and make it very easy for other people to add their own extractors.
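For the command-line part, something along these lines might work. This is only a sketch: the `--extractors` flag, the `REGISTRY` dict, and the helper names are invented for illustration and are not options of the existing script, and both extractors are placeholders.

```python
import argparse
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Optional


class DateExtractor:
    def extract_date(self, file_path: Path) -> Optional[datetime]:
        raise NotImplementedError


class ExifDateExtractor(DateExtractor):
    def extract_date(self, file_path: Path) -> Optional[datetime]:
        return None  # placeholder: real EXIF parsing is omitted here


class FileNameDateExtractor(DateExtractor):
    def extract_date(self, file_path: Path) -> Optional[datetime]:
        # Placeholder: only understands names like "2019-05-09.jpg".
        try:
            return datetime.strptime(file_path.stem, "%Y-%m-%d")
        except ValueError:
            return None


# Hypothetical registry mapping command-line names to extractor classes.
REGISTRY: Dict[str, type] = {
    "exif": ExifDateExtractor,
    "filename": FileNameDateExtractor,
}


def parse_extractors(argv: List[str]) -> List[DateExtractor]:
    """Build the extractor list in the order given on the command line."""
    parser = argparse.ArgumentParser()
    # "--extractors" is an invented flag, not part of the real script.
    parser.add_argument(
        "--extractors",
        nargs="+",
        choices=sorted(REGISTRY),
        default=["exif", "filename"],
    )
    args = parser.parse_args(argv)
    return [REGISTRY[name]() for name in args.extractors]


def first_date(extractors: List[DateExtractor], file_path: Path) -> Optional[datetime]:
    """Return the first non-None date in priority order, or None."""
    for extractor in extractors:
        date = extractor.extract_date(file_path)
        if date is not None:
            return date
    return None
```

With this, `parse_extractors(["--extractors", "filename", "exif"])` would try the file name first and EXIF second, and a new extractor only needs one extra entry in the registry.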

If you think this could be a good idea and would be open for such a change, I would be happy to discuss the details with you and come up with a PR.

TheLastGimbus commented 2 years ago

hmmm, this seems nice

although, generally, i'm tired of this script because it's messy+spaghetti+hack-ish. Currently I'm thinking of re-writing it in Dart or something. Tho I will take this into consideration if i stick with Python

TheLastGimbus commented 1 year ago

Another issue resolved by v3 :tada: