akai10tsuki / mkvbatchmultiplex

Batch multiplex video files using MKVToolnix generated command line
MIT License

[Bug/Performance] Performance Drop #10

Closed VegethB closed 5 months ago

VegethB commented 3 years ago

I was hoping not ... Apparently the new file detection system is a bit too slow. For example: with previous versions, I entered the command for a batch of 25 episodes totalling 55 GB plus the corresponding HEVC files, also 25 episodes (used to reduce the total size). Within 3-5 seconds the message came back that the command was OK. At that point I ran the check, and it finished in a short time (no more than 2-3 minutes).

Now, just pasting the command and waiting for the OK from the program takes 10-15 minutes. And I'm not even talking about the check ... (for a batch like the one described above I left the PC on all night and in the morning it was still working on the last 2 episodes).

If the speed can't be improved, would it be possible to reintroduce the old system alongside the new one, so you can choose? (After all, the BOM problem mostly concerns files NOT already inside an MKV.) I didn't open this issue right away because I wanted to understand whether it was caused by these 50 GB batches (2-3 GB per file) or whether it happens in general. In the end it is both: file size matters, and it is slow in general. Thanks in advance

OS: Windows 10 x64 (on a 500 GB Kingston SSD), 10 GB RAM, i5-4460, internal 2.5" SATA 3 HDD

Edit1:

Mainly on MKV files of this type:
VIDEO: x264 (from 1.5 to 3 GB)
AUDIO1: FLAC 2.0
AUDIO2: FLAC 5.1
AUDIO3: DTS-HD_MA 5.1
Sub1: ASS
Sub2: ASS
Sub3: HDMV PGS
Sub4: HDMV PGS
Chapter

Example this torrent: https://nyaa.si/view/1256326

akai10tsuki commented 3 years ago

Hi,

Sorry for the delayed response. I have been very busy. I will start looking into this problem soon. I have been testing on an SSD with small files while looking for structure errors in the files. This helps me in development.

Algorithm 0 is 'almost' the original system. But if this started when I was trying to correct the BOM issue, it could be something else that I changed. Going back will not be easy, but I will stop what I was doing and work on this performance issue now. If there is another torrent that demonstrates the issue, it would be useful to work with more than one.

One question: it seems you are using the same drive for the source and the destination. Is that correct?

I want to make sure because whenever I have batch jobs that large I tend to use more than one drive. Usually these are external drives connected via USB 3; their throughput is higher than the SATA drives. But I will do all the testing with SATA drives, using one and two drives.

VegethB commented 3 years ago

In this issue, specifically, batching that torrent is really slow. The problem I am having shows up mostly right after pasting the command ... the program also freezes for 15 seconds.

One question: it seems you are using the same drive for the source and the destination. Is that correct?

No. Usually it is one of these two situations:

1. Starting file ([Judas] - series name - s01e01 - random title.mkv) on disk X; File with FLAC audio to put on Judas (bla bla bla) on disk X; Files with subs of other languages on Network Drive K; Destination file (the mux of the 3 files) on the network disk G;

2. Starting file ([Judas] - series name - s01e01 - random title.mkv) on network disk G; File with FLAC audio to put on Judas (bla bla bla) on disk X; Files with subs of other languages on the X network disk; Destination file (the mux of the 3 files) on the network disk G (in another folder than the starting file);

EDIT1: @akai10tsuki

Video Ex: https://mega.nz/file/ctkHWaha#TVEV6Eqd6I8y9Mlc1TXpB90ADx0XMV1ez0SxTo4oH-c

PC spec:
CPU: Intel i5-4460
RAM: 10 GB DDR3 1333
HDDs: all SATA 3
OS: Windows 10 x64 on a 2.5" SSD
Network: 1 Gb Ethernet

Command: C:\Users\MSI\Downloads\EXE\mkvtoolnix\mkvmerge.exe --ui-language it --output ^"Y:\FanSeries\Blue Exorcist\Stagione 01\v\Ao no Exorcist - s01e01 - Erai-raws [Web] - .mkv^" --no-audio --no-video --sub-charset 2:UTF-8 --language 2:it --track-name ^"2:Netflix Ita^" --default-track 2:yes --sub-charset 3:UTF-8 --language 3:en --track-name ^"3:Netflix Eng^" ^"^(^" ^"Y:\FanSeries\Blue Exorcist\Stagione 01\Ao no Exorcist - s01e01 - Erai-raws [Web] - .mkv^" ^"^)^" --audio-tracks 2 --language 0:en --track-name ^"0:[Prof] x265 10bit BD^" --display-dimensions 0:1920x1080 --language 2:ja --track-name ^"2:AAC 2.0 Jpn^" --default-track 2:yes --language 3:en --track-name ^"3:Signs ^& Songs^" --language 4:en --track-name 4:Full ^"^(^" ^"G:\_torrent\_HEVC-Conv\Blue Exorcist ^(2011^) S01 [1080p x265 HEVC 10bit BluRay Dual Audio AAC] [Prof]\[Prof] S01E01 - The Devil Resides in Human Souls.mkv^" ^"^)^" --track-order 1:0,1:2,0:2,0:3,1:4,1:3

In this case (since I have no series to change😆) I replaced the (Web) video and audio tracks with the BD versions. The starting file is on the network drive; the files with the new tracks are on my HDD (SATA); the new files are written one folder below the starting files.

The network is fully wired at 1 Gb (PC and router both at 1 Gb). This performance issue did not exist before the BOM fix.

Usually (when I add new series) both the starting file and the file with the new video and audio tracks are on my PC's own HDD (SATA). The subs are on a network drive and the destination folder is on the final network drive. Sometimes the subs are also on the same HDD (so all 3 source files are on my HDD and the destination folder is on the network drive). Even if you change the destination folder to the same HDD (where the source files also are), the problem still exists.

akai10tsuki commented 3 years ago

Hi,

I started analyzing this and this is what is going on:

Before:

I made a more or less light analysis of the command line: just that it was soundly constructed. No checking of the files whatsoever. Then my testing crew (really a few friends with good-sized media servers) started to use the program more, and a lot of the time the problems that they reported had to do:

That is why my goal was that, when you submit a job, you can see beforehand what to expect from the results. They are not that experienced with computers. This was before making the project public.

Now:

The check is now more thorough, almost final. With what I call Algorithms, the more common issues are now solved before each individual merge is submitted. I'm also working on the foundation for being able to stop a job and resume it. The structures are higher level programming-wise and easier to use, but internally they are quite complex.

I will try to go back so that the check done when the command is pasted is as light as possible, and the full analysis is made only when the debug buttons (,...) are used.

What you are experiencing is the GUI being blocked because of heavy IO in the background. I am using Python and Qt for Python, and with Qt it has been tricky to keep the GUI responsive 100% of the time. Also, standard Python (CPython) only permits one thread to run Python code at a time (you can look up the Python GIL; this is not specific to Windows), which makes working with GUIs more interesting. As for why Python at all: this project is me learning Python.

I will go back to checking in a more granular manner, and work on making the GUI more responsive and giving more feedback on what is going on. While mkvmerge itself is working there is nothing more that I can do.
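A minimal sketch of the general idea of moving a slow check off the thread that drives the interface, using only the Python standard library rather than Qt. The names `slow_check` and `check_in_background` are hypothetical and are not taken from the project; in a real Qt application the progress callback would typically be replaced by a signal so that widgets are only touched from the GUI thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_check(path):
    """Stand-in for a heavy per-file analysis (hypothetical)."""
    time.sleep(0.05)  # pretend to read and inspect the file
    return path, "Structure looks Ok."


def check_in_background(paths, on_progress):
    """Start checking every path on a worker thread and return at once.

    The returned Future can be polled (future.done()) from an event loop,
    so the caller is never blocked while the files are being read.
    """
    executor = ThreadPoolExecutor(max_workers=1)

    def work():
        results = []
        for index, path in enumerate(paths, start=1):
            results.append(slow_check(path))
            # Called from the worker thread, not from the caller's thread.
            on_progress(index, len(paths))
        return results

    future = executor.submit(work)
    executor.shutdown(wait=False)  # already submitted work still runs
    return future
```

Because file reads release the GIL while they wait on the disk or network, a worker thread like this can overlap with the interface even though only one thread runs Python code at a time.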

So now I need opinions

  1. I can do no check on the files at all when pasting the command. (Jobs may not run)
  2. Check only that the number of files is correct. (Jobs more likely to run)
  3. Out of, say, 25 files, check only the first N (e.g. 3).
  4. Give options to select 1, 2 or 3
  5. ???

Any other ideas welcome.
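To make the options above concrete, here is a rough sketch of how a selectable check level could decide which jobs get the heavy analysis. `CheckLevel` and `select_jobs_to_check` are hypothetical names, and representing each source as a plain list of file names is an assumption made for the example, not how the program stores them.

```python
from enum import Enum


class CheckLevel(Enum):
    NONE = 1     # option 1: no file checks when the command is pasted
    COUNT = 2    # option 2: only compare the number of files per source
    FIRST_N = 3  # option 3: also run the heavy check on the first N jobs


def select_jobs_to_check(source_groups, level, first_n=3):
    """Return the indexes of the jobs that need the heavy check.

    source_groups holds one list of file names per source in the command.
    Raises ValueError when the file counts differ (COUNT and FIRST_N only).
    """
    if level is CheckLevel.NONE:
        return []
    counts = {len(group) for group in source_groups}
    if len(counts) > 1:
        raise ValueError(f"file counts differ: {sorted(counts)}")
    if level is CheckLevel.COUNT:
        return []
    total = counts.pop() if counts else 0
    return list(range(min(first_n, total)))
```

With a setting like this, option 4 amounts to letting the user pick the `level` value.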

When version 0.xxx was out I was doing only number 2. And since Algorithm 1 was stable there has been no feedback from the original beta testers; only 2 of them said they will not be using any new versions.

VegethB commented 3 years ago

What you are experiencing is the GUI being blocked because of heavy IO in the background. I am using Python and Qt for Python, and with Qt it has been tricky to keep the GUI responsive 100% of the time.

I wasn't referring to the blocking itself (that is something I always see: when the program starts working, everything freezes and you have to wait for it to finish). The problem is that it now freezes for as long as 15 minutes in a row (so it looks more 'crashed' than 'working'), while the previous version did not have this problem at all (if it had been there from the beginning, I wouldn't even have opened the issue).

But if you tell me that this happens because now the analysis function has been fully implemented (while before it was incomplete and therefore faster) I am satisfied as it is now.

And when mkvmerge is working there is nothing more that I can do.

obviously.

So, the problem I encountered is:

Before: I would paste the command and it was "analyzed" instantly, so I was free to use the GUI;

Now: when I paste the command, it takes anywhere from 10 seconds to 6 minutes or more. It depends more on the size of the files and their content than on their number ... for example, I did the batch of the Judas JoJo BD, where season 02 is 48 episodes, and the program took 5-15 seconds at most to analyze the command.

So, the reason for my issue is not small batches like that one (JoJo) but batches of 24-episode series at 60 GB in size. In my case, what gave me serious problems was that torrent and this one (https://nyaa.si/view/1223902), where I sat in front of the program (which seemed to have crashed) for as long as 10 minutes before it told me "OK command valid".

So now I need opinions

  1. I can do no check on the files at all when pasting the command. (Jobs may not run)
  2. Check only that the number of files is correct. (Jobs more likely to run)
  3. Out of, say, 25 files, check only the first N (e.g. 3).
  4. Give options to select 1, 2 or 3
  5. ???

Any other ideas welcome.

  1. ...Difficult to give an answer, but I would say to do it like this: each time you paste a command, you are asked whether you want to run an analysis (with a warning that, if it is not run, the job is more likely to fail). Put an override in the config so that, if you want, it always does the analysis or never does it (because I realize that without an analysis it is more difficult to understand why something doesn't work);

  2. Actually it already does this job ...? As in the sample video I sent, if it doesn't find the same number of files, it won't even start. What would be handy is to be able to start it anyway (sometimes a certain fansub only did 15 episodes out of 24; currently I have to move the extra files somewhere else). It would be handy to be able to tell it to start anyway and ignore the missing files. The best would be for it to work out by itself which files go together (for example, if I have: base files for 24 episodes; new subs for episodes 1, 2, 3, 5, 7, 10 etc., the program will skip episodes 4, 6, 8, 9 etc.). Obviously this can only be applied if the files are numbered and use the same numbering (otherwise, the program will warn that it cannot continue because the files are not consistent).

  3. From my point of view and in my opinion ... it wouldn't make sense. If, out of 48 episodes, number 37 is different ... the only way to find out which one it is (and why) is to use algorithm 0 and then open the two original files and compare them to those of the first command. The usefulness of the analysis is to find out which files are good and which are not. However, if I make it analyze only the first 3 episodes (or 7, or whatever number can be selected), the function itself loses its meaning.

  4. ? referred to what?

  5. One "problem" I have encountered is how the output information is provided. It would be more practical to have a more "compact" check result report. What I mean: when you run the check, the output it gives is very convenient (for debugging) but takes up space and makes it more difficult to read all the "swag". It would be enough to report "Structure looks Ok." in a sort of table at the end of the check (generated by the program itself), such as:

Check result:
Files for key - total is 12 Directory: Y:\Rev\Cautious Hero Final
Files for key - total is 12 Directory: Y:\Rev\Cautious Hero Final
Files for key - total is 12 Directory: Z:\Anime Backups\Cautious Hero\Mad le Zisell

1st file: Structure looks Ok.
2nd file: Structure looks Ok.
3rd file: mismatch track.
etc.

Extra: If for example the first 15 files are all "Structure looks Ok." instead of showing a repeating list, abbreviate to: Files 1-15: Structure looks Ok. 16th file: mismatch track. etc.

Something like that. At the moment you have to search for the various pieces of information by scrolling (very difficult with batches of 100 episodes, and maybe without having cleared the output screen for who knows how long). With this I am not saying to remove

1. Source: ['Y:\\Rev\\Cautious Hero Final\\Cautious Hero - s01e01 - Ember [BD] + OCRD - This Hero Is Too Cautious.mkv', 'Z:\\Anime Backups\\Cautious Hero\\Mad le Zisell\\[Mad le Zisell] Shinchou Yuusha - Kono Yuusha ga Ore Tueee Kuse ni Shinchou Sugiru - 01 [720p].mks']
Destination: Y:\Rev\v\Cautious Hero - s01e01 - Ember [BD] + OCRD - This Hero Is Too Cautious.mkv
Structure looks Ok.

but to leave it or even better to hide them all in a function like this:

Check result:

Full structure Output log:
```
1. Source: ['Y:\\Rev\\Cautious Hero Final\\Cautious Hero - s01e01 - Ember [BD] + OCRD - This Hero Is Too Cautious.mkv', 'Z:\\Anime Backups\\Cautious Hero\\Mad le Zisell\\[Mad le Zisell] Shinchou Yuusha - Kono Yuusha ga Ore Tueee Kuse ni Shinchou Sugiru - 01 [720p].mks']
Destination: Y:\Rev\v\Cautious Hero - s01e01 - Ember [BD] + OCRD - This Hero Is Too Cautious.mkv
Structure looks Ok.
2. Source: ['Y:\\Rev\\Cautious Hero Final\\Cautious Hero - s01e02 - Ember [BD] + OCRD - Too Much For a Novice Goddess to Bear.mkv', 'Z:\\Anime Backups\\Cautious Hero\\Mad le Zisell\\[Mad le Zisell] Shinchou Yuusha - Kono Yuusha ga Ore Tueee Kuse ni Shinchou Sugiru - 02.V2 [720p].mks']
Destination: Y:\Rev\v\Cautious Hero - s01e02 - Ember [BD] + OCRD - Too Much For a Novice Goddess to Bear.mkv
Structure looks Ok.
3. Source: ['Y:\\Rev\\Cautious Hero Final\\Cautious Hero - s01e03 - Ember [BD] + OCRD - This Hero Is Too Self-Serving.mkv', 'Z:\\Anime Backups\\Cautious Hero\\Mad le Zisell\\[Mad le Zisell] Shinchou Yuusha - Kono Yuusha ga Ore Tueee Kuse ni Shinchou Sugiru - 03 [720p].mks']
Destination: Y:\Rev\v\Cautious Hero - s01e03 - Ember [BD] + OCRD - This Hero Is Too Self-Serving.mkv
Structure looks Ok.
```

Files for key - total is 24 Directory: Y:\Rev\Cautious Hero Final
Files for key - total is 24 Directory: Y:\Rev\Cautious Hero Final
Files for key - total is 24 Directory: Z:\Anime Backups\Cautious Hero\Mad le Zisell

Files 1-15: Structure looks Ok.
16th file: mismatch track.
Files 17-21: Structure looks Ok.
Files 22-23: mismatch track.
24th file: Structure looks Ok.
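The compact summary proposed above can be produced by collapsing runs of identical results. The following is only a sketch under the assumption that the check yields one message string per file, in order; `ordinal` and `compact_report` are hypothetical helper names, not functions of the program.

```python
def ordinal(n):
    """Return 1 -> '1st', 2 -> '2nd', 3 -> '3rd', 16 -> '16th', ..."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def compact_report(results):
    """Collapse consecutive identical messages into range lines.

    results is a list with one message per file, in job order.
    """
    lines = []
    start = 0  # index where the current run of equal messages begins
    for i in range(1, len(results) + 1):
        # A run ends at the end of the list or when the message changes.
        if i == len(results) or results[i] != results[start]:
            first, last = start + 1, i  # 1-based file numbers
            if first == last:
                lines.append(f"{ordinal(first)} file: {results[start]}")
            else:
                lines.append(f"Files {first}-{last}: {results[start]}")
            start = i
    return lines
```

Fed with 15 "Structure looks Ok." results, one "mismatch track.", five more OK, two mismatches and a final OK, this produces the five summary lines of the 24-file example.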


Last (but not least), the output colors 👌👍🔝. Very often they are underestimated but they are really useful and make reading more effective.

Leaving aside the big change of point 5, I would put the command

1. Source: ['Y:\\Rev\\Cautious Hero Final\\Cautious Hero - s01e01 - Ember [BD] + OCRD - This Hero Is Too Cautious.mkv', 'Z:\\Anime Backups\\Cautious Hero\\Mad le Zisell\\[Mad le Zisell] Shinchou Yuusha - Kono Yuusha ga Ore Tueee Kuse ni Shinchou Sugiru - 01 [720p].mks']
Destination: Y:\Rev\v\Cautious Hero - s01e01 - Ember [BD] + OCRD - This Hero Is Too Cautious.mkv
+ Structure looks Ok.

in gray, or in any case not in one of the colors used to signal whether everything is OK or not. The command turns gray and the result of the command turns green / yellow / red depending on the outcome ("Structure looks Ok." in green).
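Going back to point 2 of this comment (starting the job anyway and skipping episodes that have no matching file), the pairing could be sketched roughly as follows. The two filename patterns are assumptions based only on the names quoted in this thread (`s01e01` and ` - 01 [720p]`), and `episode_number` and `pair_by_episode` are hypothetical helpers; real release names vary far more than this.

```python
import re

# Assumed naming schemes, tried in order; illustrative, not exhaustive.
EPISODE_PATTERNS = [
    re.compile(r"s\d{1,2}e(\d{1,3})", re.IGNORECASE),  # e.g. "s01e04"
    re.compile(r" - (\d{1,3})(?=\D)"),                 # e.g. " - 04 [720p]"
]


def episode_number(name):
    """Return the episode number found in a file name, or None."""
    for pattern in EPISODE_PATTERNS:
        match = pattern.search(name)
        if match:
            return int(match.group(1))
    return None


def pair_by_episode(base_files, extra_files):
    """Pair base files with extra files that share an episode number.

    Returns (pairs, skipped): skipped lists the base files without a match.
    Raises ValueError when a name carries no recognizable episode number,
    because the files are then not consistently numbered.
    """
    extras = {}
    for name in extra_files:
        number = episode_number(name)
        if number is None:
            raise ValueError(f"no episode number in {name!r}")
        extras[number] = name

    pairs, skipped = [], []
    for name in base_files:
        number = episode_number(name)
        if number is None:
            raise ValueError(f"no episode number in {name!r}")
        if number in extras:
            pairs.append((name, extras[number]))
        else:
            skipped.append(name)
    return pairs, skipped
```

Raising an error for unnumbered names mirrors the condition stated in point 2: the skip logic only makes sense when both sets of files use the same numbering.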

akai10tsuki commented 3 years ago

Hi,

Originally those buttons were for me, which is why the output is so verbose. It is, as you say, for debugging purposes. The buttons were hidden from users and enabled manually.

When checking I go in stages like:

  1. check that the command has a sound structure, and if not just stop. This is effectively the "no check" option; there is no IO

  2. check that the number of files matches. Here the IO starts, but I only read the directories, so it is relatively fast. This is what the program was doing before.

  3. then the really heavy work

  4. Give options to select 1, 2 or 3

This means selecting one of the above: no checking, checking that the file counts match, or checking just the first two or three files.

Under the hood, for me, it will be option 4. No checking is really no problem at all if you are a more or less experienced user who takes the time to understand how the program works. And when I say a job may not work, it is not that the check is required; it is more that, if I don't check, the user submits the job and only then finds out there is a problem. Experienced users won't suffer from that. One gets used to checking beforehand, or to preparing the source files so that the program can work with them without problems. Users who are not that experienced with computers can get discouraged and may abandon the idea that the program is useful at all. Once you know what to expect, no checking is really the way to go.
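Stage 2 is cheap because it only needs the directory listings, never the file contents. A minimal sketch of such a count comparison, assuming each source is a directory and that only `.mkv` and `.mks` files matter (both are assumptions for the example, as are the names `count_files` and `counts_match`):

```python
import os


def count_files(directory, extensions=(".mkv", ".mks")):
    """Count matching files using only the directory listing."""
    with os.scandir(directory) as entries:
        return sum(
            1
            for entry in entries
            if entry.is_file() and entry.name.lower().endswith(extensions)
        )


def counts_match(directories, extensions=(".mkv", ".mks")):
    """Return (all_equal, counts) for the given source directories."""
    counts = [count_files(directory, extensions) for directory in directories]
    return len(set(counts)) <= 1, counts
```

On a network drive even a listing can take a moment, but no file is opened, so the cost does not grow with file size the way the full analysis does.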

Thank you for the time and your input is greatly appreciated. Remember there is a wish list; you can suggest changes and additions, and I hope that I can do them.

I have already started working on this and it is a little too interesting. Right now it is not triggering the full check on some occasions, and it is crashing. At least it is crashing for me. The problem right now is time: I won't be able to do much for another 3 or 4 days.

The output for the analysis may take considerable time. I have to work around the GUI issues: if the analysis takes long and there is no output, that is no good.

VegethB commented 3 years ago

Under the hood, for me, it will be option 4. No checking is really no problem at all if you are a more or less experienced user who takes the time to understand how the program works. And when I say a job may not work, it is not that the check is required; it is more that, if I don't check, the user submits the job and only then finds out there is a problem. Experienced users won't suffer from that. One gets used to checking beforehand, or to preparing the source files so that the program can work with them without problems. Users who are not that experienced with computers can get discouraged and may abandon the idea that the program is useful at all. Once you know what to expect, no checking is really the way to go.

Ah ... ok, then you should just add the settings override switch (so the expert user will know that he is disabling the analysis).

Remember there is a wish list

🤔 Where is it? The github "discussions" page is not there. The projects page is empty and in the issues there is nothing open that means "wish list".

Surely it's me (I'm always getting lost on these things) but I can't find the wish list.

Thank you for the time and your input is greatly appreciated

I thank you for sharing this life-saving program with the web. Usually these tools are only ever immense and incomprehensible command lines. If it weren't for this software I would have given up making mkv muxes.

akai10tsuki commented 5 months ago

Closing old issues. If you are still using a new version and the problem persists, please reopen.