Kareadita / Kavita

Kavita is a fast, feature-rich, cross-platform reading server, built with the goal of being a full solution for all your reading needs. Set up your own server and share your reading collection with your friends and family.
http://www.kavitareader.com
GNU General Public License v3.0

Mismatch problem when series name in Comicinfo.xml uses Emoji characters only #3314

Open SteveXu9102 opened 3 weeks ago

SteveXu9102 commented 3 weeks ago

What happened?

The ScannerService module seems to match a series name consisting only of emoji characters against every other, normal series name when reading information from Comicinfo.xml.

(By 'normal', I mean strings without emoji characters, or strings that mix regular characters with emoji.)

This problem is reproducible: it occurs whenever I force a scan of a Manga library, and it may have caused some of my .cbz files to be ignored.

The related XML files are here: File 1, File 2, File 3, File 4

The problem may be related to Issue #2976, because both an emoji character and a full-width dot disappeared from the series name in the scanner output:

[Kavita] [2024-10-27 22:17:43.907 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Matches: ジャンヌオルタ matches on ジャンヌ・オルタ🔥

My guess is that the emoji-only series name became an empty string in the same process and then, for some reason, matched every other series name, which is very odd.
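To illustrate the guess above, here is a minimal Python sketch of how this could happen. It is not Kavita's actual code (Kavita is written in C#). The `normalize` and `is_same_series` functions, and the idea that the lookup also compares against an optional, usually empty localized name, are assumptions made purely for illustration.

```python
def normalize(name: str) -> str:
    """Hypothetical normalizer: keep only letters and digits, lowercased."""
    return "".join(ch for ch in name if ch.isalnum()).lower()


def is_same_series(existing_name: str, incoming_name: str,
                   incoming_localized: str = "") -> bool:
    """Hypothetical match: compare the existing key against the incoming
    series name OR its (often empty) localized name."""
    existing = normalize(existing_name)
    return (existing == normalize(incoming_name)
            or existing == normalize(incoming_localized))


# The middle dot and the emoji are dropped, as seen in the scanner output.
print(repr(normalize("ジャンヌ・オルタ🔥")))  # 'ジャンヌオルタ'

# An emoji-only name collapses to an empty string.
print(repr(normalize("😈💜")))  # ''

# The empty key equals the empty localized name, so it "matches" anything.
print(is_same_series("😈💜", "ジャンヌオルタ"))  # True
print(is_same_series("ジャンヌ・オルタ🔥", "ジャンヌオルタ"))  # True
```

Under these assumptions, a lookup for ジャンヌオルタ finds two candidates (ジャンヌ・オルタ🔥 and 😈💜), which would be consistent with the "Sequence contains more than one matching element" exception in the log.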

What did you expect?

The module should be able to recognize and process most UTF-8 characters in Comicinfo.xml instead of deleting the unrecognized ones from the loaded strings.
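As a rough sketch of the behaviour I would expect, the normalization step could fall back to the original string when stripping leaves nothing, and an empty key could be treated as matching nothing. The Python below is only an illustration under those assumptions; `normalize_keep_nonempty` and `keys_match` are hypothetical names, not Kavita functions.

```python
def normalize_keep_nonempty(name: str) -> str:
    """Strip non-alphanumeric characters, but never return an empty key
    for a non-empty name: fall back to the trimmed original instead."""
    stripped = "".join(ch for ch in name if ch.isalnum()).lower()
    return stripped if stripped else name.strip()


def keys_match(a: str, b: str) -> bool:
    """Two keys match only if they are equal AND non-empty."""
    return bool(a) and a == b


# The emoji-only name keeps a distinct, non-empty key.
print(repr(normalize_keep_nonempty("😈💜")))

# An empty localized name no longer matches an empty key.
print(keys_match("", ""))
print(keys_match(normalize_keep_nonempty("😈💜"),
                 normalize_keep_nonempty("ジャンヌオルタ")))
```

With this kind of guard, the emoji-only series would stay a separate series rather than colliding with every other name.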

Kavita Version Number - If you do not see your version number listed, please update Kavita and see if your issue still persists.

0.8.3.2

What operating system is Kavita being hosted from?

Windows

If the issue is being seen on Desktop, what OS are you running where you see the issue?

Windows

If the issue is being seen in the UI, what browsers are you seeing the problem on?

No response

If the issue is being seen on Mobile, what OS are you running where you see the issue?

None

If the issue is being seen on the Mobile UI, what browsers are you seeing the problem on?

No response

Relevant log output

......
[Kavita] [2024-10-27 22:17:43.905 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Duplicate Series in DB matches with ジャンヌオルタ: ジャンヌ・オルタ🔥
[Kavita] [2024-10-27 22:17:43.905 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Duplicate Series in DB matches with ジャンヌオルタ: 😈💜
[Kavita] [2024-10-27 22:17:43.906 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] ジャンヌオルタ matches against multiple series in the parsed series. This indicates a critical kavita issue. Key will be skipped
System.InvalidOperationException: Sequence contains more than one matching element
   at System.Linq.ThrowHelper.ThrowMoreThanOneMatchException()
   at System.Linq.Enumerable.TryGetSingle[TSource](IEnumerable`1 source, Func`2 predicate, Boolean& found)
   at API.Services.Tasks.Scanner.ParseScannedFiles.TrackSeries(ConcurrentDictionary`2 scannedSeries, ParserInfo info) in C:\Users\josep\Documents\Projects\KavitaOrg\Kavita\API\Services\Tasks\Scanner\ParseScannedFiles.cs:line 272
[Kavita] [2024-10-27 22:17:43.907 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Matches: ジャンヌオルタ matches on ジャンヌ・オルタ🔥
[Kavita] [2024-10-27 22:17:43.907 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Matches: ジャンヌオルタ matches on 😈💜
[Kavita] [2024-10-27 22:17:44.290 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Multiple series detected for 水着沖田さん (G:/hitomi_downloader_GUI/hitomi_downloaded/[Hara][Japanese]  水着沖田さん (1904997).cbz)! This is critical to fix! There should only be 1
System.InvalidOperationException: Sequence contains more than one matching element
   at System.Linq.ThrowHelper.ThrowMoreThanOneMatchException()
   at System.Linq.Enumerable.TryGetSingle[TSource](IEnumerable`1 source, Func`2 predicate, Boolean& found)
   at API.Services.Tasks.Scanner.ParseScannedFiles.MergeName(ConcurrentDictionary`2 scannedSeries, ParserInfo info) in C:\Users\josep\Documents\Projects\KavitaOrg\Kavita\API\Services\Tasks\Scanner\ParseScannedFiles.cs:line 322
[Kavita] [2024-10-27 22:17:44.291 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Duplicate Series in DB matches with 水着沖田さん: 水着沖田さん✈
[Kavita] [2024-10-27 22:17:44.293 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Duplicate Series in DB matches with 水着沖田さん: 😈💜
[Kavita] [2024-10-27 22:17:44.295 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] 水着沖田さん matches against multiple series in the parsed series. This indicates a critical kavita issue. Key will be skipped
System.InvalidOperationException: Sequence contains more than one matching element
   at System.Linq.ThrowHelper.ThrowMoreThanOneMatchException()
   at System.Linq.Enumerable.TryGetSingle[TSource](IEnumerable`1 source, Func`2 predicate, Boolean& found)
   at API.Services.Tasks.Scanner.ParseScannedFiles.TrackSeries(ConcurrentDictionary`2 scannedSeries, ParserInfo info) in C:\Users\josep\Documents\Projects\KavitaOrg\Kavita\API\Services\Tasks\Scanner\ParseScannedFiles.cs:line 272
[Kavita] [2024-10-27 22:17:44.296 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Matches: 水着沖田さん matches on 水着沖田さん✈
[Kavita] [2024-10-27 22:17:44.296 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Matches: 水着沖田さん matches on 😈💜
[Kavita] [2024-10-27 22:17:45.315 +08:00  41] [Fatal] API.Services.Tasks.ScannerService [ScannerService] Multiple series detected for Girly Hairy (G:/hitomi_downloader_GUI/hitomi_downloaded/[Heitai Gensui, Ishino Kanon][Chinese] Girly Hairy (1819726).cbz)! This is critical to fix! There should only be 1
System.InvalidOperationException: Sequence contains more than one matching element
   at System.Linq.ThrowHelper.ThrowMoreThanOneMatchException()
   at System.Linq.Enumerable.TryGetSingle[TSource](IEnumerable`1 source, Func`2 predicate, Boolean& found)
   at API.Services.Tasks.Scanner.ParseScannedFiles.MergeName(ConcurrentDictionary`2 scannedSeries, ParserInfo info) in C:\Users\josep\Documents\Projects\KavitaOrg\Kavita\API\Services\Tasks\Scanner\ParseScannedFiles.cs:line 322
......

Additional Notes

OS: Windows 11 x64 24H2 26100.2161; Unicode UTF-8 Global Language Support is on in Region Settings.

majora2007 commented 3 weeks ago

I'll take a look at this, but I'll let you know ahead of time that it's low priority, given it seems like an edge case on top of an edge case.