Closed Belzak56 closed 2 years ago
Are you running latest beta?
Could you email or upload the sub?
I'm using the latest beta: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.5.4/SubtitleEditBeta.zip
The subtitle is below: Mv84.zip
Thx for the sub - beta updated: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.5.4/SubtitleEditBeta.zip
2+ 3 should be improved.
If "1" still occurs, do you have some samples?
2 + 3 are working perfectly now. Thanks Nikse!
Unfortunately, other issues have occurred.
The OCR recognition is now ignoring the quotation mark ("), the Arabic comma (،), the letter (Alef), and the parentheses (sometimes).
Note that Max. error % is 0.0.
Below are some examples:
## Line 1 of Subtitle:
## Line 138 of Subtitle:
## Line 139 of Subtitle:
As for "1", it seems to result from the "Right to left" option being ticked. If this option is not ticked, the order of the numbers is correct. I don't know how this can be fixed, as the "Right to left" option is necessary for Arabic. I would suggest, if possible, treating digits only as "LTR" instead of "RTL".
This can be seen in lines 202, 204, 206, 209, 309, 310, 311, and 312 of the subtitle attached in the previous post.
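The suggestion to treat digits as LTR inside an RTL line can be sketched roughly as below. This is only an illustration in Python, not how Subtitle Edit's OCR is implemented: the function name `visual_to_logical_rtl` and the `NUMBER_RUN` pattern are hypothetical, and uppercase Latin letters stand in for Arabic letters. It assumes the glyphs arrive in visual left-to-right order and that a number may contain a single `:`, `.` or `,` between digit groups.

```python
import re

# Hypothetical pattern: a run of digits, optionally joined by a single
# ':', '.' or ',' (for example 12:30). In Python, \d also matches
# Arabic-Indic digits.
NUMBER_RUN = re.compile(r"\d+(?:[:.,]\d+)*")


def visual_to_logical_rtl(visual: str) -> str:
    """Convert a line scanned left-to-right into right-to-left reading order,
    while keeping every number in left-to-right order."""
    # Step 1: reverse the whole line, as a plain "Right to left" pass would.
    reversed_line = visual[::-1]
    # Step 2: the reversal also flipped the numbers, so flip each one back.
    return NUMBER_RUN.sub(lambda m: m.group(0)[::-1], reversed_line)


if __name__ == "__main__":
    # Uppercase letters are placeholders for Arabic letters.
    print(visual_to_logical_rtl("12:30 CBA"))
    print(visual_to_logical_rtl("255 CBA"))
```

Without step 2 the same input would give "ABC 03:21" and "ABC 552", which resembles some of the reversed results reported in this thread.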
Beta updated: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.5.4/SubtitleEditBeta.zip
Do you have a line where I could test the "reversed numbers issue"?
This Beta works perfectly. At least for this font. Thanks a lot Nikse.
For the reversed numbers issue, please first see the table below of Arabic and English numbers for your reference:
Below are samples from the same subtitle:
## Line 202: Recognized: 30:12 Correct: 12:30
## Line 204: Recognized: 552 and 64 2 Correct: 255 and 642
## Line 206: Recognized: 070 Correct: 700
## Line 209: Recognized: 70 0 Correct: 700
Please note that this happens whether Arabic or English numbers are chosen for the output text.
@niksedk
I did intensive tests on the latest beta that you've provided:
https://github.com/SubtitleEdit/subtitleedit/releases/download/3.5.4/SubtitleEditBeta.zip
My observations are below:
The overlapping issue has been resolved for this font. It now happens very rarely and can be handled very easily. For other fonts it does not happen at all.
I've managed to resolve the "Reversed Numbers" issue by using regex.
Regex is very useful and helped me tackle many minor issues, but I would like to ask if it is possible to add a "Comments" or "Remarks" tab at the end, near the "Search Type" tab. I know that I can write a comment in the same line by adding (?# ..... ), but it would be much easier if the comments were in a separate column.
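For readers unfamiliar with the (?# ..... ) syntax, here is a minimal sketch of a regex rule that carries its own inline comment. It is written in Python for illustration only; the pattern, the replacement, and the names `RULE` and `apply_rule` are hypothetical and are not the rules actually used for this subtitle. The replacement syntax in Subtitle Edit may also differ from Python's `\2:\1`.

```python
import re

# Hypothetical rule: swap the two halves of a clock time that OCR read in the
# wrong order, e.g. "30:12" -> "12:30". The (?# ... ) group is a comment and
# does not take part in matching.
RULE = r"\b(\d{2}):(\d{2})\b(?# swap the two halves of a reversed clock time)"


def apply_rule(text: str) -> str:
    """Apply the swap to every clock-like match in the text."""
    return re.sub(RULE, r"\2:\1", text)


if __name__ == "__main__":
    print(apply_rule("at 30:12"))
```

Note that this rule swaps every match, including times that were already correct, so it only makes sense on lines known to be affected.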
Below are two examples from this subtitle:
## Line 10, No of pixels is space: 4
## Line 10, No of pixels is space: 5
## Line 305, No of pixels is space: 4
## Line 305, No of pixels is space: 5
My suggestion to resolve this issue is to allow decimals in the "No of pixels is space" setting, but I don't know if this is feasible programmatically.
As advised here:
https://github.com/SubtitleEdit/subtitleedit/issues/2643
I've tried "Binary Image Compare" and it was very helpful indeed.
Nevertheless, there were a few issues I would like to point out here:
1. Numbers are recognized in the wrong order when the value of "No of pixels is space" is (5), e.g. 365 is read as 536. If the value is changed to (7) this doesn't happen, but it leads to another problem, which is explained in point no. 2 below. I'm guessing this might be resolved by allowing decimals in the "No of pixels is space" setting.
2. Two lines overlapping. I've been able to work around this manually by choosing a value of (50) for "Min. line height (split)", and when I face another two overlapping lines I reduce the value to (40) or (45), and so on. I would appreciate it if you could allow the value of "Min. line height (split)" to be adjusted in steps of 1 from 40 to 50 instead of steps of 5, so that the ideal line height can be reached without going back and forth to adjust it manually.
3. There are two Arabic letters, (Ra'a) and (Zai, the same as Ra'a but with a dot above it), after which the letter (Alef) and the parentheses are recognized as a space, as shown below:
I'm not sure why this happens only for the letter (Alef) and the parentheses and not for any other letters, but it might be related to the width of the letter (Alef) and the parentheses, and to the letter (Alef) falling within the range of the letters (Ra'a) and (Zai).