Signbank / Global-signbank

An online sign dictionary and sign database management system for research purposes. Developed originally by Steve Cassidy. This repo is a fork for the Dutch version, previously called 'NGT-Signbank'.
http://signbank.cls.ru.nl
BSD 3-Clause "New" or "Revised" License

Creating gloss images is broken #734

Open ocrasborn opened 3 years ago

ocrasborn commented 3 years ago

Both the automatic generation of gloss images, and the button for "Create Citation Form Image from Current Video" (in the edit mode) seem to be broken. (For the new dataset IS_WFD2007, I saw that on the server, there was not yet a folder under glossimage/. Creating that folder didn't solve the issue. However, it doesn't work for dataset ISL either, and for that dataset the folder had already been generated.)

susanodd commented 3 years ago

On my local server this has never worked. It's something to do with the video resizing routines.

susanodd commented 3 years ago

There might be something wrong with the file system permissions. There is an error from extractMiddleFrame -> MiddleFrameExtracter. The two folders it makes, filename-frames/all and filename-frames/middle, are empty. It doesn't find the expected image file in folder middle. @vanlummelhuizen ?

susanodd commented 3 years ago

I found the line of code that is causing the problem:

"-filter:v", scale_formula,

This is inside the extract_frames function inside of extractMiddleFrame.py : MiddleFrameExtracter

I did this on my local copy of the code (CNGT_scripts). Aside from lots of print statements, that's the only code I actually changed. (I merely commented it out.)

To debug, I switched the ffmpeg command to "-v verbose" to see what was going on. The filter above is not recognised.

@vanlummelhuizen can you update this in the CNGT_scripts?

susanodd commented 3 years ago

The CNGT_scripts are managed by @vanlummelhuizen

Woseseltops commented 3 years ago

Okay this was an interesting puzzle, but I think I figured it out. There are two separate problems:

Happy @ocrasborn ?

ocrasborn commented 2 years ago

Happy!

susanodd commented 1 year ago

This is broken.

vanlummelhuizen commented 1 year ago

> This is broken.

Any error?

Perhaps this is solved by me installing imagemagick/convert. See https://github.com/Signbank/Global-signbank/issues/911#issuecomment-1497032914.

susanodd commented 1 year ago

Yes, this works again!

susanodd commented 1 year ago

For this gloss, creating an image does not work. (Create Citation Form Image from Current Video)

https://signbank.cls.ru.nl/dictionary/gloss/4438

The videos for glosses starting with a hash tag (#) were not showing up in the Gloss List. I revised that code so they show now. But for this one, the image is wrong and it seems stuck.

vanlummelhuizen commented 1 year ago

This was caused by parsing the video file path as a URL in CNGT-scripts. Everything with a # was affected because in a URL everything after a # is interpreted as a fragment identifier, not as part of the path.

I removed it: https://github.com/vanlummelhuizen/CNGT-scripts/commit/eb4c9fceb2e34aa7e11133cfe9674d4df865b77c. Could you try again to see that this works now @susanodd ?
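
The effect can be reproduced with Python's standard `urllib.parse`. This is only a minimal sketch: whether CNGT-scripts used exactly this call is an assumption, and the path is an example of a video stored in a `#G` folder.

```python
import os
from urllib.parse import urlparse

# Example video path; the folder and the file name both start with '#'.
video_path = "/var/www/writable/glossvideo/NGT/#G/#G-B-4438.mp4"

# Parsed as a URL, everything after the first '#' becomes the fragment,
# so the path component stops at the NGT folder.
parsed = urlparse(video_path)
print(parsed.path)      # /var/www/writable/glossvideo/NGT/
print(parsed.fragment)  # G/#G-B-4438.mp4

# Treated as a plain file system path, the file name is kept intact.
file_name = os.path.basename(video_path)
print(file_name)        # #G-B-4438.mp4
```

Handling the value with `os.path` (or `pathlib`) instead of a URL parser avoids the truncation.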

susanodd commented 1 year ago

Oh, so I need to actually deploy so it installs the revised code.

susanodd commented 1 year ago

Just a sec.

susanodd commented 1 year ago

Nope, didn't work.

From the wsgi log:

Video file: /var/www/writable/glossvideo/NGT/#G/#G-B-4438.mp4
ffmpeg -v quiet -i /var/www/writable/glossvideo/NGT/#G/#G-B-4438.mp4 -filter:v scale='iw*max(1,sar)':'ih*max(1,1/sar)' /var/www/writable/tmp/signbank-ExtractMiddleFrame/#G-B-4438.mp4-frames/all/frame-%5d.png
convert /var/www/writable/tmp/signbank-ExtractMiddleFrame/#G-B-4438.mp4-frames/middle/#G-B-4438.png -resize x180 /var/www/writable/tmp/signbank-ExtractMiddleFrame/#G-B-4438.mp4-frames/middle/#G-B-4438_320x180.png
IOError:  [Errno 1] Operation not permitted: '/var/www/writable/glossimage/NGT/#G/#G-B-4438.png'
[pid: 409|app: 0|req: 33/128] 10.65.167.47 () {74 vars in 1877 bytes} [Fri Aug 18 12:31:04 2023] GET /dictionary/createcitationimage/4438 => generated 0 bytes in 380 msecs (HTTP/1.1 302) 6 headers in 218 bytes (1 switches on core 2)

Here's the directory:

../writable/glossimage/NGT/#G:
total 224
-rwxrwxr-x 1 root wwwsignbank  78676 Mar 30 14:26 '#G-3814.png'
-rwxrwxr-x 1 root wwwsignbank 164699 Aug 18 10:31 '#G-B-4438.png'

There is something else going on. I cd'd from the container to the writable (as shown above) but the actual ls and pwd in that directory is:

signbank-ansible
/root

That was when I first went to NGT and then went to #G. Now I did it as one path cd ...

cd /var/www/writable/glossimage/NGT/#G

Then pwd yields: /var/www/writable/glossimage/NGT/#G

But in the "pathname prompt" it has an additional # after the path.

(env) root@signbank-new:/var/www/writable/glossimage/NGT/#G#

susanodd commented 1 year ago

See comments above. I have the impression the hash tag is being interpreted.

susanodd commented 1 year ago

(env) root@signbank-new:/var/www/writable/glossimage/NGT/#G# ls -l
total 224
-rwxrwxr-x 1 root wwwsignbank  78676 Mar 30 14:26 '#G-3814.png'
-rwxrwxr-x 1 root wwwsignbank 164699 Aug 18 10:31 '#G-B-4438.png'

I don't see why it can't write over that file.
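
A small diagnostic like the one below might help narrow this down. It is a sketch, not part of Signbank: `describe_file` is a hypothetical helper, and it only reports what the operating system says about the file and the current process. Note that Errno 1 is EPERM ("Operation not permitted"), which differs from EACCES ("Permission denied"); EPERM can occur, for example, when a process tries to change the mode or timestamps of a file it does not own. Whether that is what happened here is not confirmed.

```python
import errno
import os
import stat


def describe_file(path):
    """Report ownership, mode and writability of `path` (hypothetical helper)."""
    info = os.stat(path)
    return {
        "mode": stat.filemode(info.st_mode),   # e.g. '-rwxrwxr-x'
        "owner_uid": info.st_uid,
        "group_gid": info.st_gid,
        "process_uid": os.geteuid(),           # user the server process runs as
        "writable": os.access(path, os.W_OK),  # checked against the real uid/gid
    }


# Errno 1 from the log corresponds to EPERM.
print(errno.errorcode[1])
```

Calling `describe_file` on the existing PNG from within the web server process would show whether that process owns the file or only has group access.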

susanodd commented 1 year ago

It looks like it fixed itself. Now the new image is showing.

susanodd commented 1 year ago

Super cool!

vanlummelhuizen commented 1 year ago

@susanodd I fixed it. The only thing was that you had to refresh your page, I guess.

susanodd commented 1 month ago

This is broken.

vanlummelhuizen commented 1 month ago

I just tested uploading a video to https://signbank.cls.ru.nl/dictionary/gloss/36430/ and it works as expected. What problems do you encounter exactly?

susanodd commented 1 month ago

It basically doesn't work on Safari, on an Apple computer. (Using PyCharm)

It worked initially, then the file disappeared.

vanlummelhuizen commented 1 month ago

> (Using PyCharm)

Is this about your local Signbank then? Because you are not very specific, I cannot help you.

susanodd commented 1 month ago

> > (Using PyCharm)
>
> Is this about your local Signbank then? Because you are not very specific, I cannot help you.

Yes, it also happens on Ubuntu PyCharm.

What about this ?

USE_X_SENDFILE

in protected_media

That's the main difference with Apache?

[BABBLE] It looks like that is where I need to check what it's doing with the filenames. Or catch if it doesn't find the proper name. Should protected_media only be in the template paths if the video/image already exists? And must it exist on the file system?

In other issues, sometimes the os... exists did not work correctly. PyCharm flags things where paths are being created, that things have the wrong type. Like IntegerField isn't an "int". Or CharField isn't a "str". PyCharm wants you to pay for a professional version, so I suspect it complains about stuff so it can suggest to upgrade.

susanodd commented 1 month ago

Here is the gloss model method that is being called:

    def create_citation_image(self):
        from signbank.video.models import GlossVideo
        print('create citation')
        glossvideo = GlossVideo.objects.get(gloss=self, version=0)
        print('after getting')
        print(glossvideo.videofile.__dict__)
        print(glossvideo.__dict__)
        glossvideo.make_poster_image()

This does not create the image. This is what is being displayed in the PyCharm log:

create citation
Citation image for gloss 36430 could not be created.
[26/Sep/2024 08:03:21] "GET /dictionary/createcitationimage/36430 HTTP/1.1" 302 0

It seems to be skipping three print statements. The "could not be created" is from the last call to make_poster_image. How is this possible? Is it somehow executing a previous call? Are they put in a queue or something? It's also weird that the GET url appears after the error message.

This is PyCharm doing this. It does not create an image and it also does not print anything.

susanodd commented 1 month ago

I revised the call some to filter instead of get, to force it not to get subtypes. Now it actually executes the print statements. Was it just "not bothering" to execute the code? I don't know why the code above didn't print anything.

    def create_citation_image(self):
        from signbank.video.models import GlossVideo
        print('create citation')
        glossvideos = GlossVideo.objects.filter(gloss=self, glossvideonme=None, glossvideoperspective=None, version=0)
        print('after getting: ', glossvideos)
        if not glossvideos:
            print('no gloss video')
            return
        glossvideo = glossvideos.first()
        print(glossvideo.videofile.__dict__)
        print(glossvideo.__dict__)
        glossvideo.make_poster_image()

create citation
after getting:  <QuerySet [<GlossVideo: glossvideo/tstMH/PE/PERSPECTIEF-36430.mp4>]>
{'_file': None, 'name': 'glossvideo/tstMH/PE/PERSPECTIEF-36430.mp4', 'instance': <GlossVideo: glossvideo/tstMH/PE/PERSPECTIEF-36430.mp4>, 'field': <django.db.models.fields.files.FileField: videofile>, 'storage': <signbank.video.models.GlossVideoStorage object at 0x111145820>, '_committed': True}
{'upload_to': <function get_video_file_path at 0x1110daca0>, '_state': <django.db.models.base.ModelState object at 0x11231d010>, 'id': 22139, 'videofile': <FieldFile: glossvideo/tstMH/PE/PERSPECTIEF-36430.mp4>, 'gloss_id': 36430, 'version': 0}
Video file: /Users/susaneven/Documents/writable/glossvideo/tstMH/PE/PERSPECTIEF-36430.mp4
/usr/local/bin/ffmpeg -v quiet -i /Users/susaneven/Documents/writable/glossvideo/tstMH/PE/PERSPECTIEF-36430.mp4 -filter:v scale='iw*max(1,sar)':'ih*max(1,1/sar)' /Users/susaneven/Documents/writable/tmp/signbank-ExtractMiddleFrame/PERSPECTIEF-36430.mp4-frames/all/frame-%5d.png
convert /Users/susaneven/Documents/writable/tmp/signbank-ExtractMiddleFrame/PERSPECTIEF-36430.mp4-frames/middle/PERSPECTIEF-36430.png -resize x180 /Users/susaneven/Documents/writable/tmp/signbank-ExtractMiddleFrame/PERSPECTIEF-36430.mp4-frames/middle/PERSPECTIEF-36430_320x180.png
Generating still images succes!

In the above, you can see a field '_committed': True

Should we be looking at that if the files aren't "finished being uploaded" ?

I am trying to figure out why previously uploaded videos disappear. This issue happened as a side effect.

Comments and suggestions are welcome! It worked now. But I don't understand why it wasn't printing anything in the first one. It skipped a bunch of code.

susanodd commented 1 month ago

This is for @vanlummelhuizen ! See above two comments with specific details. Thanks.

vanlummelhuizen commented 1 month ago

The reason you don't see some of your print statements is that Gloss.create_citation_image is called from a try-except clause in the view:

https://github.com/Signbank/Global-signbank/blob/6242cc47de593c3debcd78da7a2f2a25f97f0f62/signbank/dictionary/views.py#L1880-L1892

Apparently glossvideo = GlossVideo.objects.get(gloss=self, version=0) fails and it goes to the except clause. Unfortunately, this except clause does not output the exact error/exception.
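
A sketch of how an except clause could report the actual exception follows. The names here are hypothetical stand-ins, not the code in views.py: `VideoLookupError` replaces whatever the lookup really raises, and `create_citation_image_view` only mimics the shape of the view.

```python
import logging

logger = logging.getLogger(__name__)


class VideoLookupError(Exception):
    """Hypothetical stand-in for the exception raised by the video lookup."""


def create_citation_image(gloss_id):
    # Stand-in for Gloss.create_citation_image: the lookup fails right after
    # the first print, so none of the later prints would ever run.
    print('create citation')
    raise VideoLookupError(f'lookup failed for gloss {gloss_id}')


def create_citation_image_view(gloss_id):
    """Hypothetical view body that keeps the exception details."""
    try:
        create_citation_image(gloss_id)
    except Exception as error:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception('Citation image for gloss %s could not be created.', gloss_id)
        return f'{type(error).__name__}: {error}'
    return 'ok'
```

With this shape, the log would contain the exception type and traceback next to the "could not be created" message, instead of the message alone.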

Your change of glossvideo = GlossVideo.objects.get(gloss=self, version=0) to glossvideos = GlossVideo.objects.filter(gloss=self, glossvideonme=None, glossvideoperspective=None, version=0) does not fail because it can deal with an empty query result.
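
The difference between the two calls can be sketched without a database. In Django, `QuerySet.get` raises `DoesNotExist` when nothing matches and `MultipleObjectsReturned` when more than one row matches, while `filter` returns a queryset that may be empty. The classes and functions below are plain-Python stand-ins, not Django's, and the rows are made up.

```python
class DoesNotExist(Exception):
    """Stand-in for Model.DoesNotExist."""


class MultipleObjectsReturned(Exception):
    """Stand-in for Model.MultipleObjectsReturned."""


def filter_rows(rows, **conditions):
    """Like filter(): return every matching row, possibly none."""
    return [row for row in rows
            if all(row.get(key) == value for key, value in conditions.items())]


def get_row(rows, **conditions):
    """Like get(): return exactly one match or raise."""
    matches = filter_rows(rows, **conditions)
    if not matches:
        raise DoesNotExist(conditions)
    if len(matches) > 1:
        raise MultipleObjectsReturned(conditions)
    return matches[0]


# Made-up rows: gloss 1 has two version-0 videos, gloss 2 has none.
videos = [
    {'gloss': 1, 'version': 0, 'kind': 'main'},
    {'gloss': 1, 'version': 0, 'kind': 'perspective'},
]
```

Here `filter_rows(videos, gloss=2, version=0)` returns an empty list, whereas `get_row` raises for gloss 2 (no match) and also for gloss 1 (two matches). Which of the two exceptions the original `get` call raised is not shown in this thread.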

I think somehow there is no video to extract an image from. Why this happens, I don't know. Again, I don't think this is a problem on the live server right now.

> It's also weird that the GET url appears after the error message.

No. Django outputs a log message of a request after it is done processing it.


Tip: use the debug functionality of PyCharm. It shows you a stack(trace) so that you can see what function/method is calling what function/method. That may help find out problems like these.