JPEG parameter struct mismatch: library thinks size is 656, caller expects 624

agboom commented 5 years ago

I recently installed the facerecognition app with the dependency binaries built from source. When I run occ face:background_job, the app starts scanning, but so far every image is skipped because of the following error:

    Processing image ***
    Faces found: 0. Image will be skipped because of the following error: jpeg_loader: error while loading image: JPEG parameter struct mismatch: library thinks size is 656, caller expects 624

Searching for this issue, some reports indicate that it could be caused by a libjpeg version mismatch (source), but I'm struggling to see whether that's the case. All my packages are up to date.

For reference, apk list | grep libjpeg outputs:

libjpeg-turbo-utils-2.0.3-r0 x86_64 {libjpeg-turbo} (BSD-3-Clause IJG Zlib)
libjpeg-turbo-doc-2.0.3-r0 x86_64 {libjpeg-turbo} (BSD-3-Clause IJG Zlib)
libjpeg-8-r6 x86_64 {jpeg} (AS-IS)
libjpeg-turbo-2.0.3-r0 x86_64 {libjpeg-turbo} (BSD-3-Clause IJG Zlib) [installed]
libjpeg-turbo-dev-2.0.3-r0 x86_64 {libjpeg-turbo} (BSD-3-Clause IJG Zlib)
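
To narrow a listing like this down to what is actually installed, the output can be filtered on apk's [installed] marker. This is a minimal sketch that assumes the listing format shown above; the function name installed_jpeg_pkgs is made up for illustration.

```shell
# Print only the JPEG-related packages that apk marks as installed.
# Reads `apk list` output on stdin; the function name is illustrative.
installed_jpeg_pkgs() {
  grep -i 'jpeg' | grep -F '[installed]' | awk '{ print $1 }'
}

# Only attempt the real query where apk exists (e.g. on Alpine).
if command -v apk >/dev/null 2>&1; then
  apk list | installed_jpeg_pkgs
fi
```

For the listing above, this would leave only libjpeg-turbo-2.0.3-r0, since it is the only entry carrying the marker.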

matiasdelellis commented 5 years ago

Hi @agboom, I don't know the exact cause, but as you say, it is a mismatch between libjpeg and dlib.

To verify this, please try one of the Python examples and check whether the error shows up there too.

This one in particular performs the same procedure as our application:

Check this, and then I will try to help you compile it again.

agboom commented 5 years ago

Thank you for your suggestion @matiasdelellis, I ran the python example and it works fine. However, I installed the dlib Python library using python setup.py install and it seems to compile and include the binaries itself, so I'm not sure it's a proper representation of my dlib installation.

I also built dlib using the build parameters in this Dockerfile, hoping that it would make a difference, but sadly it did not.

Not sure where to go from here :thinking:

stalker314314 commented 5 years ago

We log a more detailed error when debug output is enabled. Can you pass -vvv to occ face:background_job? It probably will not say anything more useful (as this is a native exception), but let's try.

> I recently installed the facerecognition app with the dependency binaries built from source.

Can you elaborate on this? Are you referring to dlib's and pdlib's sources? How did you compile dlib and pdlib? Did you do anything differently from the wiki instructions (https://github.com/matiasdelellis/facerecognition/wiki/Installation)? Also, which OS and version are you running, in case I want to reproduce it?

agboom commented 5 years ago

Thanks, @stalker314314. Running with -vvv gives more info:

    Exception: jpeg_loader: error while loading image: JPEG parameter struct mismatch: library thinks size is 656, caller expects 624 in /var/www/html/apps/facerecognition/lib/BackgroundJob/Tasks/ImageProcessingTask.php:224
    Stack trace:
    #0 /var/www/html/apps/facerecognition/lib/BackgroundJob/Tasks/ImageProcessingTask.php(224): CnnFaceDetection->detect('/tmp/oc_tmp_Egd...')
    #1 /var/www/html/apps/facerecognition/lib/BackgroundJob/Tasks/ImageProcessingTask.php(167): OCA\FaceRecognition\BackgroundJob\Tasks\ImageProcessingTask->findFaces(Object(CnnFaceDetection), '/var/www/html/d...', Object(OCA\FaceRecognition\Db\Image))
    #2 /var/www/html/apps/facerecognition/lib/BackgroundJob/BackgroundService.php(120): OCA\FaceRecognition\BackgroundJob\Tasks\ImageProcessingTask->execute(Object(OCA\FaceRecognition\BackgroundJob\FaceRecognitionContext))
    #3 /var/www/html/apps/facerecognition/lib/Command/BackgroundCommand.php(138): OCA\FaceRecognition\BackgroundJob\BackgroundService->execute(0, true, NULL, NULL)
    #4 /var/www/html/3rdparty/symfony/console/Command/Command.php(255): OCA\FaceRecognition\Command\BackgroundCommand->execute(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
    #5 /var/www/html/3rdparty/symfony/console/Application.php(901): Symfony\Component\Console\Command\Command->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
    #6 /var/www/html/3rdparty/symfony/console/Application.php(262): Symfony\Component\Console\Application->doRunCommand(Object(OCA\FaceRecognition\Command\BackgroundCommand), Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
    #7 /var/www/html/3rdparty/symfony/console/Application.php(145): Symfony\Component\Console\Application->doRun(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
    #8 /var/www/html/lib/private/Console/Application.php(213): Symfony\Component\Console\Application->run(Object(Symfony\Component\Console\Input\ArgvInput), Object(Symfony\Component\Console\Output\ConsoleOutput))
    #9 /var/www/html/console.php(97): OC\Console\Application->run()
    #10 /var/www/html/occ(11): require_once('/var/www/html/c...')
    #11 {main}
    yielding
    Processing image ***

I'm indeed referring to dlib's and pdlib's sources. I compiled them in Alpine using the instructions in the Wiki page you linked, except the names of the Alpine packages are a bit different (see below).

The Nextcloud instance runs in a Docker container, so I'm building in a Docker environment.

So far I have tried various ways of building the libraries:

- In a separate container based on Alpine 3.10, so that I would not pollute the Nextcloud container with build packages. After building, I copied the libraries over to the Nextcloud container. I used the build instructions from the Wiki.
- In the Nextcloud container, with the same build instructions. I thought the libjpeg error might be caused by a version mismatch between the two containers. The Nextcloud container also runs Alpine 3.10.
- With different build parameters based on this Dockerfile. Since it includes the preprocessor directive DLIB_JPEG_SUPPORT, I thought it might make a difference. I tried binaries from both the separate container and the Nextcloud container, but they both yielded the same error as before.

The Dockerfile to build dlib and pdlib is as follows:

FROM alpine

RUN apk update && apk add bash vim git

# DLib https://github.com/goodspb/pdlib#dependencies
RUN apk add cmake make gcc libc-dev g++ openblas-dev libx11-dev pkgconf

RUN git clone https://github.com/davisking/dlib.git \
  ; cd dlib/dlib \
  ; mkdir build \
  ; cd build \
  ; cmake -DBUILD_SHARED_LIBS=ON .. \
  ; make \
  ; make install

# https://github.com/goodspb/pdlib#installation
RUN apk add php7-dev

ENV PKG_CONFIG_PATH /usr/local/lib64/pkgconfig/

RUN git clone https://github.com/goodspb/pdlib.git \
  ; cd pdlib \
  ; phpize \
  ; ./configure \
  ; make \
  ; make install

matiasdelellis commented 5 years ago

Hi both, it is definitely a dlib error. Do not look elsewhere. :sweat_smile: This Dockerfile seems correct, except that you didn't add the libjpeg dependency.

  ; cmake -DBUILD_SHARED_LIBS=ON .. \

Look carefully at the output of this command to see whether libjpeg is being used, which version, etc. Share it here if you like.

agboom commented 5 years ago

@matiasdelellis It says -- Found system copy of libjpeg: /usr/lib/libjpeg.so.

Which is right, because libjpeg is installed via the package manager using apk add jpeg jpeg-dev.

This is (part of) the log of apk:

(34/63) Installing libjpeg-turbo (2.0.3-r0)
(35/63) Installing libjpeg-turbo-utils (2.0.3-r0)
(36/63) Installing jpeg (8-r6)
(37/63) Installing pkgconf (1.6.1-r1)
(38/63) Installing libjpeg-turbo-dev (2.0.3-r0)
(39/63) Installing jpeg-dev (8-r6)

Could it be that apk installs the wrong version for dlib?
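
One way to check which libjpeg a compiled binary actually resolves at load time is to inspect its dynamic dependencies with ldd. The sketch below only filters ldd output; the helper name linked_jpeg_libs and the library path in the usage comment are assumptions, not taken from this thread.

```shell
# Print the JPEG shared libraries named in `ldd` output read from stdin,
# together with the file each one resolves to (hypothetical helper name).
linked_jpeg_libs() {
  awk '/jpeg/ { print $1, $3 }' | sort -u
}

# Usage in the build container; the path is an assumption and depends on
# where `make install` placed the dlib shared library:
#   ldd /usr/local/lib64/libdlib.so | linked_jpeg_libs
```

If the library reported there comes from a different package than the headers dlib was compiled against, a struct size disagreement like the one in the error message is a plausible result.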

matiasdelellis commented 5 years ago

I'm compiling your Dockerfile. As shared, it doesn't seem to report anything, but once you add the packages "jpeg jpeg-dev libpng libpng-dev" it prints this:

-- Found system copy of libpng: /usr/lib/libpng.so;/lib/libz.so
-- Found system copy of libjpeg: /usr/lib/libjpeg.so

So I guess that, without these libraries, dlib either compiles with an internal copy or drops JPEG support altogether. I do not know which. For now I think it is better to add the dependencies.

matiasdelellis commented 5 years ago

> Which is right, because libjpeg is installed via the package manager using apk add jpeg jpeg-dev.

Well, the Dockerfile you shared does not include these lines. :sweat_smile:

> Could it be that apk installs the wrong version for dlib?

There are few changes, but you can try compiling the stable version instead of the git checkout.

matiasdelellis commented 5 years ago

P.S.: https://github.com/davisking/dlib/issues/1913 It is not the same error, but it probably has the same cause.

matiasdelellis commented 5 years ago

Well, it seems that I have a Dockerfile that works, but there are no major modifications compared to the one you shared. Please, @agboom, see if you can extract something useful.

[matias@nube dlib]$ cat Dockerfile 
FROM alpine

RUN apk update && apk add bash vim git

# DLib https://github.com/goodspb/pdlib#dependencies
RUN apk add cmake make gcc libc-dev g++ unzip openblas-dev libx11-dev pkgconf jpeg jpeg-dev libpng libpng-dev

RUN git clone --branch v19.18 https://github.com/davisking/dlib.git \
  ; cd dlib/dlib \
  ; mkdir build \
  ; cd build \
  ; cmake -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release .. \
  ; make \
  ; make install

# https://github.com/goodspb/pdlib#installation
RUN apk add php7-dev php7-gd

ENV PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib64/pkgconfig/

RUN pkg-config --libs --cflags libjpeg \
  ; pkg-config --libs --cflags dlib-1

RUN git clone https://github.com/goodspb/pdlib.git \
  ; cd pdlib \
  ; phpize \
  ; ./configure \
  ; make \
  ; make install

# Add pdlib to php
RUN echo "extension=pdlib.so" >> `php --ini | grep "Loaded Configuration" | sed -e "s|.*:\s*||"`

# Download test and clean it
RUN wget https://github.com/matiasdelellis/facerecognition/files/3107912/crop-tests.zip \
  ; unzip -q crop-tests.zip \
  ; rm crop-tests/cropped_* \
  ; rm crop-tests/test.csv

# download models needed to test and execute it..
RUN cd crop-tests ; mkdir -p vendor/models/1/ \
  ; wget https://github.com/davisking/dlib-models/raw/94cdb1e40b1c29c0bfcaf7355614bfe6da19460e/mmod_human_face_detector.dat.bz2 -O vendor/models/1/mmod_human_face_detector.dat.bz2 \
  ; bzip2 -d vendor/models/1/mmod_human_face_detector.dat.bz2 \
  ; wget https://github.com/davisking/dlib-models/raw/2a61575dd45d818271c085ff8cd747613a48f20d/dlib_face_recognition_resnet_model_v1.dat.bz2 -O vendor/models/1/dlib_face_recognition_resnet_model_v1.dat.bz2 \
  ; bzip2 -d vendor/models/1/dlib_face_recognition_resnet_model_v1.dat.bz2 \
  ; wget https://github.com/davisking/dlib-models/raw/4af9b776281dd7d6e2e30d4a2d40458b1e254e40/shape_predictor_5_face_landmarks.dat.bz2 -O vendor/models/1/shape_predictor_5_face_landmarks.dat.bz2 \
  ; bzip2 -d vendor/models/1/shape_predictor_5_face_landmarks.dat.bz2 \
  ; php test.php

This runs a test that was originally written for issue https://github.com/matiasdelellis/facerecognition/issues/140 and it works correctly:

We will calculate the descriptors with default pdlib method... Done.
Now we will calculate the descriptors in the same way that NC Facerecognition does with diferent margins.
left; top; right; bottom; percent; distance
207; 127; 376; 297; 0; 0.13652059092093
207; 127; 376; 297; 1; 0.13652059092093
206; 126; 377; 298; 2; 0.11186140743847
205; 125; 378; 299; 3; 0.10049401857476
204; 124; 379; 300; 4; 0.11421536336674
203; 123; 380; 301; 5; 0.11855536425593
202; 122; 381; 302; 6; 0.11815553728961
202; 122; 381; 302; 7; 0.11815553728961
201; 121; 382; 303; 8; 0.10604281327289
200; 120; 383; 304; 9; 0.10391523245392
199; 119; 384; 305; 10; 0.12912076867759
198; 118; 385; 306; 11; 0.079678820377295
197; 117; 386; 307; 12; 0.084377919761384
197; 116; 386; 308; 13; 0.069171079103895
196; 116; 387; 308; 14; 0.067778617055956
195; 115; 388; 309; 15; 0.074639437307429
194; 114; 389; 310; 16; 0.093405272402453
193; 113; 390; 311; 17; 0.083427698715018
192; 112; 391; 312; 18; 0.060695344166223
191; 111; 392; 313; 19; 0.062310582001867

agboom commented 5 years ago

Thanks for the elaborate follow up @matiasdelellis!

> Well, the Dockerfile you shared does not include these lines.

You're absolutely right, I made a mistake here, mixing up two different Dockerfiles. The observation is still right, but I failed to say that the output in that comment (where libjpeg is found as a system copy) came from building another Dockerfile in which jpeg and jpeg-dev were indeed installed. Sorry for the mix-up :sweat_smile:

I'm now trying your Dockerfile to see if it makes a difference. Will report back with the results :crossed_fingers:!

agboom commented 5 years ago

Your Dockerfile works, so that's great! The tests you included run fine.

A little out of scope for this issue, but is it expected behavior that the occ face:background_job exits after processing one image? The logs do not show anything suspicious, but I'd expect it to continue to the next image.

1/10 - Executing task CheckRequirementsTask (Check all requirements)
    Found 4096 MB available to PHP.
2/10 - Executing task CheckCronTask (Check that service is started from either cron or from command)
3/10 - Executing task LockTask (Acquire lock so that only one background task can run)
4/10 - Executing task DisabledUserRemovalTask (Purge all the information of a user when disable the analysis.)
yielding
yielding
yielding
5/10 - Executing task StaleImagesRemovalTask (Crawl for stale images (either missing in filesystem or under .nomedia) and remove them from DB)
    Skipping stale images removal for user ** as there is no need for it
6/10 - Executing task CreateClustersTask (Create new persons or update existing persons)
    Skipping cluster creation, not enough data (yet) collected. For cluster creation, you need either one of the following:
    * have 1000 faces already processed (you have 0),
    * have 100 images (you have 0),
    * or you need to have 95% of you images processed (you have 0.00%)
yielding
    Skipping cluster creation, not enough data (yet) collected. For cluster creation, you need either one of the following:
    * have 1000 faces already processed (you have 0),
    * have 100 images (you have 0),
    * or you need to have 95% of you images processed (you have 0.00%)
yielding
    0 faces found for clustering
    0 persons found after clustering
yielding
7/10 - Executing task AddMissingImagesTask (Crawl for missing images for each user and insert them in DB)
    Skipping image scan for user shanna that has disabled the analysis
    Skipping image scan for user admin that has disabled the analysis
    Skipping full image scan for user **
8/10 - Executing task EnumerateImagesMissingFacesTask (Find all images which don't have faces generated for them)
yielding
9/10 - Executing task ImageProcessingTask (Process all images to extract faces)
    NOTE: Starting face recognition. If you experience random crashes after this point, please look FAQ at https://github.com/matiasdelellis/facerecognition/wiki/FAQ
yielding
    Processing image **
    Image scaled from 768x1280 to 1374x2290 (since max image area is 3145728 pixels^2)
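
The scaled size in the last log line is consistent with multiplying both sides by the square root of the ratio between the maximum area and the original area. Here is a rough check of that arithmetic, assuming rounding to the nearest pixel (the app's actual rounding is not shown in this thread, and scaled_size is a hypothetical helper):

```shell
# Scale w x h so that the area approaches max_area, keeping the aspect ratio.
# scaled_size is a hypothetical helper used only to check the log line.
scaled_size() {
  awk -v w="$1" -v h="$2" -v max="$3" 'BEGIN {
    s = sqrt(max / (w * h))                      # common scale factor
    printf "%dx%d\n", w * s + 0.5, h * s + 0.5   # round to nearest pixel
  }'
}

scaled_size 768 1280 3145728   # prints 1374x2290
```

Here 3145728 / (768 * 1280) is 3.2, so each side grows by a factor of about 1.789.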

agboom commented 5 years ago

As an addition, I've created a Dockerfile with multi-stage build for building dlib, pdlib and the facerecognition app. This may be useful for fellow Docker users, so if you like we could add this to the documentation of this app for easier onboarding?

This is all largely out of scope for this issue, so I'm willing to create a new issue or even a PR if you'd like.

The Dockerfile:

FROM alpine AS builder

# DLib https://github.com/goodspb/pdlib#dependencies
RUN apk add cmake make gcc libc-dev g++ unzip openblas-dev libx11-dev pkgconf jpeg jpeg-dev libpng libpng-dev

ARG DLIB_BRANCH=v19.18

RUN wget -c -q https://github.com/davisking/dlib/archive/${DLIB_BRANCH}.tar.gz \
  && tar xf ${DLIB_BRANCH}.tar.gz \
  && mv dlib-* dlib \
  && cd dlib/dlib \
  && mkdir build \
  && cd build \
  && cmake -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release .. \
  && make \
  && make install

# https://github.com/goodspb/pdlib#installation
RUN apk add php7-dev php7-gd

ENV PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib64/pkgconfig/

RUN pkg-config --libs --cflags libjpeg \
  && pkg-config --libs --cflags dlib-1

ARG PDLIB_BRANCH=master

RUN wget -c -q https://github.com/goodspb/pdlib/archive/$PDLIB_BRANCH.zip \
  && unzip $PDLIB_BRANCH \
  && mv pdlib-* pdlib \
  && cd pdlib \
  && phpize \
  && ./configure \
  && make \
  && make install

RUN apk add npm bash make composer curl wget php7-dom php7-tokenizer php7-xmlwriter php7-xml

ARG FR_BRANCH=master

ADD busybox.patch .

RUN wget -c -q -O facerecognition https://github.com/matiasdelellis/facerecognition/archive/$FR_BRANCH.zip \
  && unzip facerecognition \
  && mv facerecognition-* fr \
  && cd fr \
  && patch -uN < ../busybox.patch \
  && make

ARG NC_VERSION=17

FROM nextcloud:$NC_VERSION-fpm-alpine

RUN apk add jpeg libpng php7-gd openblas

COPY --from=builder /usr/local /usr/local
COPY --from=builder /usr/lib/php7/modules/pdlib.so /usr/local/lib/php/extensions/no-debug-non-zts-20180731/
COPY --from=builder fr /var/www/html/apps/facerecognition

RUN echo "extension=pdlib.so" > /usr/local/etc/php/conf.d/pdlib.ini

NOTE: because Alpine contains a different version of wget (busybox instead of GNU), I had to apply a small patch to get the Makefile for facerecognition to work:

diff --git a/Makefile b/Makefile
index 766c8a0..416959b 100644
--- a/Makefile
+++ b/Makefile
@@ -22,10 +22,10 @@ default: build
 test-bin-deps:
    @echo "Checking binaries needed to build the application"
    @echo "Testing npm, curl, wget and bzip2. If one is missing, install it with the tools of your system."
-   npm -v
-   curl -V
-   wget -V
-#  bzip2 -V # FIXME: bzip2 always return an error.
+   which npm
+   which curl
+   which wget
+   which bzip2

 composer:
 ifeq (,$(composer))
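
The patch swaps the version-flag calls for presence checks. A similar check can be sketched with the POSIX command -v builtin, which only looks the tool up and does not run it. check_bins is a hypothetical helper, not part of the facerecognition Makefile.

```shell
# Report whether each named tool is on PATH without invoking it, so the
# check does not depend on a version flag such as -V being supported.
# check_bins is a hypothetical helper; the patch in the thread uses `which`.
check_bins() {
  missing=0
  for bin in "$@"; do
    if command -v "$bin" >/dev/null 2>&1; then
      echo "found: $bin"
    else
      echo "missing: $bin" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example call with the tools named in the Makefile target:
#   check_bins npm curl wget bzip2
```

The function returns a non-zero status if any tool is missing, so a build step calling it would stop at that point.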

matiasdelellis commented 5 years ago

Great .. :grimacing:

> The logs do not show anything suspicious, but I'd expect it to continue to the next image.

You enabled analysis for the user?

https://github.com/matiasdelellis/facerecognition#how-to-use-it (Point one :wink: )
or: sudo -u apache php occ user:setting user facerecognition enabled true :wink:

If that is the case, maybe we could add a better message saying so. :disappointed:

Some notes about your Dockerfile:

php7-gd

I added it because the test depends on it, but Nextcloud already has it as a dependency, so you don't need to add it there.

RUN pkg-config --libs --cflags libjpeg \
  && pkg-config --libs --cflags dlib-1

This is completely unnecessary. I put it there to check that there are no duplicate libraries. :sweat_smile:

ENV PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib64/pkgconfig/

It is probably unnecessary, but it is more correct to extend environment variables than to replace them. :sweat_smile: I'm not even sure it works well. You would have to print the variable to see how it ends up. :confused:
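
In a plain shell, extending an empty or unset variable as VAR=$VAR:dir leaves a leading colon, and presumably the same happens with the ENV line if the base image does not define PKG_CONFIG_PATH. Below is a small sketch that avoids this and prints the result; append_path is a hypothetical helper, not part of pkg-config or Docker.

```shell
# Append a directory to a colon-separated search path without leaving a
# stray leading colon when the current value is empty.
# append_path is a hypothetical helper, not part of pkg-config or Docker.
append_path() {
  # $1: current value (may be empty), $2: directory to append
  if [ -n "$1" ]; then
    printf '%s:%s\n' "$1" "$2"
  else
    printf '%s\n' "$2"
  fi
}

PKG_CONFIG_PATH="$(append_path "${PKG_CONFIG_PATH:-}" /usr/local/lib64/pkgconfig/)"
export PKG_CONFIG_PATH
# Print the result to see how the variable ends up.
echo "PKG_CONFIG_PATH=$PKG_CONFIG_PATH"
```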

About the patch, I will think about it later. Thanks. :grimacing:

agboom commented 5 years ago

> You enabled analysis for the user?

Good point, I did, but via the web interface. occ user:setting <user> gives the following output:

  - facerecognition:
    - recreate-clusters: false
    - force-create-clusters: false
    - enabled: true
    - full_image_scan_done: true

Thanks for your feedback on the Dockerfile, I'll work on a proposal for a wiki page that includes the Dockerfile and post it later.

agboom commented 5 years ago

Looking at the admin settings for the app, it says that the analysis has not started yet. Is there a way to dig up some more debug info maybe?

agboom commented 5 years ago

Sorry for hijacking this issue again. Since this is clearly a different issue, I have opened a new one and will close this: https://github.com/matiasdelellis/facerecognition/issues/176

matiasdelellis / facerecognition

JPEG parameter struct mismatch: library thinks size is 656, caller expects 624 #175