Open susanodd opened 2 weeks ago
@vanlummelhuizen about "reverse renaming the bak bak files": do we just assume they are MP4?
There are files that are not MP4. The listing below shows the files in glossvideo that end in .bak and do not have the string 'MP4' in the file type.
root@signbank-new:/var/www/writable/glossvideo# find . -type f | grep -P '\.bak$' | xargs -i file {} | grep -v MP4 | less
./NGT/ON/ONE-AND-A-HALF-B-40012.bak17345.bak.bak: ISO Media, Apple iTunes Video (.M4V) Video
./NGT/ON/ONE-AND-A-HALF-B-40012.bak17346.bak: ISO Media, Apple iTunes Video (.M4V) Video
./NGT/ON/ONE-AND-A-HALF-B-40012.bak.bak.bak.bak.bak: ISO Media, Apple iTunes Video (.M4V) Video
./NGT/ON/ONE-AND-A-HALF-B-40012.bak17344.bak.bak.bak: ISO Media, Apple iTunes Video (.M4V) Video
./NGT/BL/BLIKJE-A-36667.bak.bak: ISO Media, Apple iTunes Video (.M4V) Video
./NGT/BA/BACTERIE-A-40006.bak13572.bak: ISO Media, Apple iTunes Video (.M4V) Video
./NGT/te/testlemmaidglosstranslation6-3729.bak.bak: ISO Media, Apple iTunes Video (.M4V) Video
./NGT/te/testlemmaidglosstranslation74-2793.bak.bak: ISO Media, Apple iTunes Video (.M4V) Video
./CSL_Shanghai/LA/LAUNDRY-MACHINE-A-6153.mp4.bak: ISO Media, Apple iTunes Video (.M4V) Video
However, when I search for them in the database, they don't seem to belong to a GlossVideo object:
>>> files = [
... "glossvideo/NGT/ON/ONE-AND-A-HALF-B-40012.bak17345.bak.bak",
... "glossvideo/NGT/ON/ONE-AND-A-HALF-B-40012.bak17346.bak",
... "glossvideo/NGT/ON/ONE-AND-A-HALF-B-40012.bak.bak.bak.bak.bak",
... "glossvideo/NGT/ON/ONE-AND-A-HALF-B-40012.bak17344.bak.bak.bak",
... "glossvideo/NGT/BL/BLIKJE-A-36667.bak.bak",
... "glossvideo/NGT/BA/BACTERIE-A-40006.bak13572.bak",
... "glossvideo/NGT/te/testlemmaidglosstranslation6-3729.bak.bak",
... "glossvideo/NGT/te/testlemmaidglosstranslation74-2793.bak.bak",
... "glossvideo/CSL_Shanghai/LA/LAUNDRY-MACHINE-A-6153.mp4.bak"
... ]
>>> print(", ".join([str(GlossVideo.objects.filter(videofile=file).count()) for file in files]))
0, 0, 0, 0, 0, 0, 0, 0, 0
So, the current state is that all files in glossvideo for which a GlossVideo object exists are MP4. But I don't think it is guaranteed that it will always be that way.
Whoa! It made some really weird file names there!
extra bak baks after the new extension
There is video code that still uses "bak bak". But I thought it was being circumvented.
non-mp4
Okay, that is what I was afraid of: some of the bak bak files might have totally different extensions.
I tried converting some off-line and that works. So probably a command is needed to check the format of the files and convert them if necessary.
It's possible that many of the backup files are the wrong format. That would be a normal reason for users to upload again.
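A format check like the one suggested above could start from the file header rather than the file name. The following is only a minimal sketch: it reads the major brand from the leading `ftyp` box of an ISO media file. The function names and the `MP4_BRANDS` set are my own assumptions, not existing Signbank code, and the `file` utility used in the listing does a more thorough job.

```python
from typing import Optional

# Assumption: major brands we are willing to treat as "real" MP4.
# This set is illustrative and would need to be agreed on.
MP4_BRANDS = {"isom", "iso2", "mp41", "mp42", "avc1"}


def major_brand(path) -> Optional[str]:
    """Return the major brand of an ISO media file, or None if no ftyp box leads the file."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    # Bytes 0-3 hold the box size, 4-7 the box type, 8-11 the major brand.
    if len(header) < 12 or header[4:8] != b"ftyp":
        return None
    return header[8:12].decode("ascii", errors="replace")


def needs_conversion(path) -> bool:
    """True when the file is not recognised as MP4 by its major brand."""
    return major_brand(path) not in MP4_BRANDS
```

Files for which `needs_conversion` returns True could then be handed to ffmpeg, as was done offline. The sketch assumes the `ftyp` box is the first box in the file, which is usual but not guaranteed.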
Converting files currently in glossvideo? As said, all files that are not MP4 don't have a corresponding GlossVideo object, so converting is not necessary.
@susanodd Why are there two very similar command scripts to rename backed-up glossvideo files?
- https://github.com/Signbank/Global-signbank/blob/master/signbank/dictionary/management/commands/rename_backup_gloss_videos.py
- https://github.com/Signbank/Global-signbank/blob/master/signbank/dictionary/management/commands/rename_backup_glossvideos.py
And what does https://github.com/Signbank/Global-signbank/blob/master/signbank/dictionary/management/commands/rename_non_mp4_extensions.py do?
Are they tested, reviewed? Did you already use them on the server?
[THIS GOT A BIT LONG]
They are tested. But only locally. We don't have video files on the development servers.
@vanlummelhuizen all of the "renamed" non-mp4 files have been converted to real mp4 files (offline, using ffmpeg).
QUESTION: What to do about the videos of DELETED glosses
The GlossVideo objects are deleted when a gloss is deleted. An entry is added to DeletedGlossOrMedia that contains the Signbank ID. But the videos are not deleted.
There is a command now (#1374) to find and delete video files that do not have GlossVideo objects.
This includes the video files of deleted glosses.
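To illustrate the idea of finding files without GlossVideo objects, here is a rough sketch. It is not the command from #1374. The function name, the `subdir` parameter and the way the known paths are passed in are assumptions of mine, and the sketch only lists candidates, it does not delete anything.

```python
import os
from pathlib import Path


def find_orphan_files(media_root, known_videofiles, subdir="glossvideo"):
    """List files under media_root/subdir whose relative path is not in known_videofiles.

    known_videofiles is a set of paths relative to media_root,
    for example "glossvideo/NGT/BL/BLIKJE-A-36667.bak.bak".
    """
    media_root = Path(media_root)
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(media_root / subdir):
        for name in filenames:
            # Build the path in the same form as the strings queried in the shell above.
            relative = (Path(dirpath) / name).relative_to(media_root).as_posix()
            if relative not in known_videofiles:
                orphans.append(relative)
    return sorted(orphans)
```

In a Django shell the known set could presumably be built with something like `set(GlossVideo.objects.values_list("videofile", flat=True))`, assuming the stored values have the same form as the paths used in the query earlier in this thread.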
The three commands listed above that rename files will be removed after all the video files have the correct backup names and the correct format. The pull request (#1374) has modified the code that makes the reversion files, so the wrong filenames will stop being created after that is deployed. (Chicken or the egg here.)
TO DO: Convert format of non-mp4 files. Those that used to have "bak bak" sequences did not have any video extensions on them. Apparently it was assumed everything was converted using "ensure_mp4".
I've implemented a "renaming" procedure that changes the wrong format to the correct format.
The new format leaves the "mp4" in the filename. So the old format files with "bak bak" sequences are missing the video format. I assume it is always "mp4" since we used to use "ensure_mp4" (on Signbank uploads, not the API). But we can't be sure of this, because there is still code that mentions the (version * ".bak") suffix. (See #1374.)
Incidentally, the "create poster image" does not work on videos that are NOT in "mp4", which is no longer checked because the API did not want that. So that could be why they are not being created sometimes, if the video is in the wrong format.
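For reference, the old-style names from the listing can be taken apart mechanically. This is only a sketch: `parse_backup_name` and `VIDEO_EXTENSIONS` are hypothetical names, not existing Signbank code. The digits that sometimes follow ".bak" (as in ".bak17345") are kept as part of the suffix, since I am not assuming what they mean, and the sketch does not decide what the corrected name should be.

```python
import re

# Assumption: extensions we would accept as a video format in the name.
VIDEO_EXTENSIONS = (".mp4", ".m4v", ".mov", ".webm")

# One or more trailing ".bak" parts, each optionally followed by digits.
_TRAILING_BAKS = re.compile(r"((?:\.bak\d*)+)$")


def parse_backup_name(filename):
    """Split a file name into its stem, video extension (if any) and ".bak" count."""
    match = _TRAILING_BAKS.search(filename)
    suffix = match.group(1) if match else ""
    stem = filename[: len(filename) - len(suffix)]
    extension = next(
        (ext for ext in VIDEO_EXTENSIONS if stem.lower().endswith(ext)), None
    )
    return {
        "stem": stem,
        "extension": extension,  # None means the old name lost its video format
        "bak_count": suffix.count(".bak"),
    }
```

For "LAUNDRY-MACHINE-A-6153.mp4.bak" this reports extension ".mp4" and one ".bak". For "BLIKJE-A-36667.bak.bak" it reports no extension and two, which is the case where the format has to be checked from the file contents instead.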