RhetTbull / osxphotos

Python app to work with pictures and associated metadata from Apple Photos on macOS. Also includes a package to provide programmatic access to the Photos library, pictures, and metadata.

`detected_text` causes `segmentation fault` and crashes python runtime #1081

Open ces3001 opened 1 year ago

ces3001 commented 1 year ago

**Describe the bug**
Using `detected_text`, whether via the `{detected_text}` template substitution on the command line or from Python code such as `dtext = p.detected_text(confidence_threshold=0.5)`, causes the Python runtime to crash with a segmentation fault. No further error information is available, and it doesn't crash in the same place every time.
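For reference, a minimal sketch of the kind of loop that triggers the crash (assuming the default system Photos library):

```python
# Sketch of the crashing pattern: calling detected_text() over many photos
# eventually segfaults at some point during the run.
import osxphotos

photosdb = osxphotos.PhotosDB()
for p in photosdb.photos():
    # detected_text() runs text detection on the photo and returns
    # a list of (text, confidence) tuples
    dtext = p.detected_text(confidence_threshold=0.5)
    print(p.original_filename, dtext)
```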

**To Reproduce**
Steps to reproduce the behavior:

**Expected behavior**
Get the detected text without crashing. It works most of the time, but when run over hundreds or thousands of photos it always crashes at some point.

**Desktop (please complete the following information):**

RhetTbull commented 1 year ago

This sounds like it could be a memory leak in the `detected_text` code, which is actually written in Objective-C and called from Python. I will take a look when I get a chance.
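If the leak is in autoreleased Objective-C objects, one common mitigation on the Python side (a hypothesis here, not a confirmed fix for this bug) is to drain an autorelease pool around each call using pyobjc's `objc.autorelease_pool()` context manager:

```python
# Hypothetical mitigation sketch: drain an autorelease pool per photo so that
# autoreleased Objective-C objects created during text detection are freed
# promptly instead of accumulating across the whole run.
import objc
import osxphotos

photosdb = osxphotos.PhotosDB()
for p in photosdb.photos():
    with objc.autorelease_pool():  # objects autoreleased inside are freed on exit
        dtext = p.detected_text(confidence_threshold=0.5)
```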

RhetTbull commented 1 year ago

@ces3001 I'll look at this bug, but if you're running on Ventura you can skip osxphotos' detected text feature and just access the text that Photos has already detected. Results won't be 100% identical, as Photos uses a higher confidence level, but it should be close. Change your keyword template to:

```
--keyword-template "{photo.search_info.detected_text?text:{photo.search_info.detected_text},}"
```

This accesses the `search_info` property of a photo to get the detected text directly from the Photos database. It will also be much faster, as text detection won't need to run at export time.
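The same data is reachable from Python; a minimal sketch, assuming a Ventura-or-later library where Photos has already stored its detected text:

```python
# Read the text Photos itself detected, straight from the library database,
# instead of re-running detection with detected_text().
import osxphotos

photosdb = osxphotos.PhotosDB()
for p in photosdb.photos():
    if p.search_info and p.search_info.detected_text:
        print(p.original_filename, p.search_info.detected_text)
```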

RhetTbull commented 1 year ago

@all-contributors please add @ces3001 for bug

allcontributors[bot] commented 1 year ago

@RhetTbull

I've put up a pull request to add @ces3001! :tada:

ces3001 commented 1 year ago

> @ces3001 I'll look at this bug, but if you're running on Ventura you can skip osxphotos' detected text feature and just access the text that Photos has already detected. Results won't be 100% identical, as Photos uses a higher confidence level, but it should be close. Change your keyword template to:
>
> `--keyword-template "{photo.search_info.detected_text?text:{photo.search_info.detected_text},}"`
>
> This accesses the `search_info` property of a photo to get the detected text directly from the Photos database. It will also be much faster, as text detection won't need to run at export time.

Thanks, this is essentially what I ended up doing in Python with your library. I moved from embedding all the metadata in the exported files to exporting images with only the necessary metadata and reading Photos.app's detected text and other metadata from Python. Thank you!
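A rough sketch of that workflow (the destination path is hypothetical; `PhotoInfo.export()` writes the image file, and `search_info` supplies the text Photos already detected):

```python
# Sketch of the workflow described above: export images, then pull detected
# text (and other metadata) from the Photos database instead of embedding it.
import osxphotos

photosdb = osxphotos.PhotosDB()
for p in photosdb.photos():
    exported = p.export("/tmp/export")  # hypothetical destination directory
    text = p.search_info.detected_text if p.search_info else []
    print(exported, text)
```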