B-M-dev / Bilingual_Manga-home-

The Offline version of Bilingual Manga
https://bilingualmanga.org/

Feature request: Source code for further improvements #1

Closed mjuhanne closed 10 months ago

mjuhanne commented 10 months ago

First of all, I want to thank you for the great service that you do for all of us learning Japanese via manga! (I became a patron just a while ago).

I'd like to further improve the Bilingual Manga catalog browser by adding several new features:

I've already written a couple of Python scripts that do all of these, but I'd really like to implement them in the web catalog itself. However, the code here on GitHub contains mostly compiled Svelte code, so changing the GUI is very hard, if not impossible. Would you be willing to share the original Svelte source code as well?

The work I've done so far:

Example:

....
  357  %known_w 93.5 %known_k 99.0 | R 8.5 /1099 | JLPT 33 16 13  2  5 (27) |[#13/2097p w/p 23 %k69] Taiyou no Ie
* 358  %known_w 93.8 %known_k 99.3 | R 8.6 /3581 | JLPT 32 12 18  2  8 (26) |[#12/2553p w/p 47 %k80] Death Note
  359  %known_w 93.8 %known_k 99.3 | R 7.4 /1556 | JLPT 32 13 15  2  6 (29) |[#25/5007p w/p 36 %k70] Nisekoi
  360  %known_w 93.9 %known_k 99.3 | R 8.6 /1497 | JLPT 27 11 19  2  9 (29) |[#20/4196p w/p 45 %k73] Liar Game
  361  %known_w 94.0 %known_k 99.4 | R 8.5 /2588 | JLPT 30 12 15  2  8 (30) |[#20/3927p w/p 65 %k75] Bakuman
  362  %known_w 94.1 %known_k 99.3 | R 8.0 /1441 | JLPT 32 18 10  1  5 (31) |[# 8/1412p w/p 17 %k58] Chobits
  363  %known_w 94.1 %known_k 99.3 | R 7.5 / 222 | JLPT 34 16 14  2  7 (24) |[# 7/1080p w/p 30 %k73] One Week Friends
  364  %known_w 94.2 %known_k 99.2 | R 7.0 / 239 | JLPT 28 13 18  2  7 (29) |[#21/4264p w/p 23 %k77] High-Rise Invasion
* 365  %known_w 94.2 %known_k 99.4 | R 8.8 /1740 | JLPT 31 13 14  2  7 (29) |[#22/3840p w/p 26 %k72] 20th Century Boys
  366  %known_w 94.2 %known_k 99.5 | R 8.4 / 980 | JLPT 24 11 14  2  7 (38) |[#138/26153p w/p 24 %k72] Hajime no Ippo
  367  %known_w 94.2 %known_k 99.3 | R 7.2 /  41 | JLPT 34 13 14  2  6 (28) |[# 6/1232p w/p 22 %k74] Orange
  368  %known_w 94.4 %known_k 99.4 | R 8.2 /1283 | JLPT 30 16 13  2  7 (30) |[#28/5146p w/p 29 %k68] Yamada-kun and the Seven Witches
* 369  %known_w 94.4 %known_k 99.6 | R 8.2 /1904 | JLPT 28 13 10  2  6 (38) |[# 7/2654p w/p 44 %k58] Love Hina
  370  %known_w 94.8 %known_k 99.0 | R 7.8 /  77 | JLPT 39 18 12  1  6 (21) |[# 3/ 544p w/p 21 %k72] I Had That Same Dream Again
* 371  %known_w 94.9 %known_k 99.6 | R 8.2 /1028 | JLPT 32 13 14  2  7 (29) |[#103/19167p w/p 56 %k79] Detective Conan

I've already read Love Hina and 20th Century Boys and some Detective Conan, which explains their 94-95% comprehension. There's also a JLPT word analysis (percentages per level). I've already starred Death Note for further reading because it has a high rating of 8.6 (from mangaupdates.com). In the case of Death Note, the JLPT word distribution is 32% (Level 5), 8% (Level 1) and 26% non-JLPT words (in parentheses). There are 12 volumes and 2553 pages, with an average of 47 words per page and a kanji-to-word ratio of 80%, which are a tad above average, but the known word/kanji percentage is 93.8%/99.3%, so it shouldn't be too big a challenge.
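For anyone who wants to post-process these listings, one line can be split into its fields roughly like this. It is a minimal sketch, not the script that produced the output: the regular expression, the function name `parse_report_line` and the dictionary keys are illustrative, and reading the number after the rating as a vote count is an assumption. The middle three JLPT columns are assumed to be Levels 4, 3 and 2, since only Level 5 (first) and Level 1 (last) are spelled out above.

```python
import re

# Hypothetical pattern for one listing line; the group names are illustrative.
LINE_RE = re.compile(
    r"^(?P<star>\*)?\s*(?P<rank>\d+)\s+"
    r"%known_w\s+(?P<known_w>[\d.]+)\s+%known_k\s+(?P<known_k>[\d.]+)\s*\|\s*"
    r"R\s+(?P<rating>-?[\d.]+)\s*/\s*(?P<count>\d+)\s*\|\s*"
    r"JLPT\s+(?P<jlpt>(?:\d+\s+){5})\((?P<non_jlpt>\d+)\)\s*\|\s*"
    r"\[#\s*(?P<volumes>\d+)/\s*(?P<pages>\d+)p\s+w/p\s+(?P<wpp>\d+)"
    r"(?:\s+%k(?P<kanji_pct>\d+))?\]\s*(?P<title>.+?)\s*$"
)


def parse_report_line(line):
    """Return a dict of fields for one listing line, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "starred": d["star"] is not None,
        "rank": int(d["rank"]),
        "known_words_pct": float(d["known_w"]),
        "known_kanji_pct": float(d["known_k"]),
        "rating": float(d["rating"]),
        # Assumption: the number after the slash is the number of ratings.
        "rating_count": int(d["count"]),
        # Percentages in printed order: Level 5 first, Level 1 last.
        "jlpt_pct": [int(x) for x in d["jlpt"].split()],
        "non_jlpt_pct": int(d["non_jlpt"]),
        "volumes": int(d["volumes"]),
        "pages": int(d["pages"]),
        "words_per_page": int(d["wpp"]),
        # Some listings omit the %k column, so this may be None.
        "kanji_pct": int(d["kanji_pct"]) if d["kanji_pct"] else None,
        "title": d["title"],
    }
```

Any trailing annotation after the title, such as a "point impr" suffix, ends up in the `title` field with this pattern.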

If a user just wants to study for the JLPT, there are several ways to select appropriate manga:

Sort the manga titles by content. Here's the manga for maximum JLPT content (minimum non-JLPT words):

...
* 358  %known_w 93.8 %known_k 99.3 | R 8.6 /3581 | JLPT 32 12 18  2  8 (26) |[#12/2553p w/p 47] Death Note
  359  %known_w 89.5 %known_k 98.3 | R 6.8 /  54 | JLPT 35 15 14  2  5 (26) |[# 2/ 383p w/p 20] The Girl Who Runs Through Time
  360  %known_w 91.5 %known_k 98.7 | R 8.8 /1690 | JLPT 36 14 13  2  6 (26) |[#13/2585p w/p 24] A Bride's Story
  361  %known_w 91.5 %known_k 98.6 | R 7.7 / 704 | JLPT 34 16 13  3  5 (25) |[# 1/ 302p w/p 19] Nijigahara Holograph
  362  %known_w 88.8 %known_k 98.2 | R 6.6 /  13 | JLPT 28 17 17  2  8 (25) |[# 3/ 625p w/p 19] Shounen Shoujo
  363  %known_w 92.0 %known_k 99.0 | R 8.3 / 257 | JLPT 37 16 13  2  4 (25) |[# 6/ 990p w/p 13] Girls' Last Tour
  364  %known_w 89.8 %known_k 97.4 | R 7.4 / 147 | JLPT 36 15 15  1  6 (25) |[# 4/ 774p w/p 13] xxxHOLiC: Rei
  365  %known_w 94.1 %known_k 99.3 | R 7.5 / 222 | JLPT 34 16 14  2  7 (24) |[# 7/1080p w/p 30] One Week Friends
  366  %known_w 93.1 %known_k 99.0 | R 7.6 / 559 | JLPT 36 16 12  2  7 (24) |[#14/2746p w/p 26] The Quintessential Quintuplets
  367  %known_w 90.6 %known_k 97.9 | R 8.8 /2370 | JLPT 33 14 17  1  8 (23) |[# 1/ 248p w/p 23] Watashitachi no Shiawase na Jikan
  368  %known_w 92.9 %known_k 99.0 | R 8.7 / 467 | JLPT 35 18 13  2  6 (23) |[#45/1500p w/p 17] Bloom Into You
  369  %known_w 94.8 %known_k 99.0 | R 7.8 /  77 | JLPT 39 18 12  1  6 (21) |[# 3/ 544p w/p 21] I Had That Same Dream Again
  370  %known_w 91.7 %known_k 98.4 | R 8.5 / 512 | JLPT 34 16 15  3  8 (21) |[#18/ 607p w/p 23] I sold my life for ten thousand yen per year.
  371  %known_w 92.6 %known_k 98.7 | R 7.8 / 160 | JLPT 35 21 13  2  5 (20) |[#10/ 446p w/p 20] I want to eat your pancreas

Here's the manga for intermediate level (maximum JLPT 2 and 3 word content):

...
* 361  %known_w 93.8 %known_k 99.3 | R 8.6 /3581 | JLPT 32 12 18  2  8 (26) |[#12/2553p w/p 47] Death Note
  362  %known_w 87.6 %known_k 98.8 | R 7.8 / 271 | JLPT 27 13 18  2  7 (30) |[# 3/ 603p w/p 42] Level E
  363  %known_w 86.6 %known_k 99.1 | R 7.8 / 404 | JLPT 22 12 18  2  7 (35) |[# 5/1085p w/p 16] Devilman
  364  %known_w 86.5 %known_k 97.0 | R 6.4 /   7 | JLPT 28 12 18  3  8 (29) |[# 1/ 189p w/p  5] NOiSE
  365  %known_w 88.7 %known_k 98.9 | R 8.2 /1182 | JLPT 26 12 18  3  9 (29) |[#21/4111p w/p 29] Assassination Classroom
  366  %known_w 91.6 %known_k 98.9 | R 8.1 / 245 | JLPT 27 10 19  2  9 (30) |[#49/10187p w/p 27] Usogui
  367  %known_w 93.9 %known_k 99.3 | R 8.6 /1497 | JLPT 27 11 19  2  9 (29) |[#20/4196p w/p 45] Liar Game
  368  %known_w 92.8 %known_k 99.0 | R 6.4 / 245 | JLPT 29 13 19  2  8 (26) |[#14/2894p w/p 18] Platinum End
  369  %known_w 90.6 %known_k 99.0 | R 8.7 /1897 | JLPT 25 11 19  2  9 (31) |[#46/7348p w/p 30] Hunter x Hunter
  370  %known_w 89.0 %known_k 98.4 | R -1.0 /   0 | JLPT 25 10 20  2 11 (29) |[# 2/ 388p w/p 23] Mein Kampf
  371  %known_w 80.4 %known_k 95.8 | R 7.0 / 325 | JLPT 20  9 20  2 10 (36) |[# 1/ 237p w/p 19] Giganto Maxia

.. and advanced level (maximum JLPT 1 word content):

  361  %known_w 90.3 %known_k 98.2 | R 6.9 /  49 | JLPT 27 12 17  2 10 (29) |[# 9/1557p w/p 14] Aposimz
  362  %known_w 87.3 %known_k 98.3 | R 8.4 /1553 | JLPT 24 11 16  2 10 (34) |[#18/3517p w/p 19] Akumetsu
  363  %known_w 87.5 %known_k 98.0 | R 8.0 / 830 | JLPT 23 11 15  2 10 (36) |[#16/3744p w/p 25] Tokyo Ghoul:re
  364  %known_w 89.0 %known_k 98.6 | R 8.1 / 381 | JLPT 26 13 14  2 10 (31) |[#22/4038p w/p 26] BEASTARS
  365  %known_w 91.2 %known_k 98.4 | R 8.2 /1000 | JLPT 29 11 17  3 10 (27) |[#10/2131p w/p  5] Blame!
  366  %known_w 89.0 %known_k 98.1 | R 7.0 /  63 | JLPT 27 11 16  2 10 (32) |[# 6/1391p w/p 33] Umineko When They Cry Episode 5: End of the Golden Witch
  367  %known_w 90.8 %known_k 98.8 | R 7.2 / 309 | JLPT 25 11 16  2 10 (33) |[#34/6816p w/p 20] Fire Force
  368  %known_w 87.6 %known_k 97.7 | R 7.7 / 398 | JLPT 29 11 16  2 10 (29) |[# 6/1252p w/p  7] Biomega
  369  %known_w 88.5 %known_k 98.7 | R 8.0 /1105 | JLPT 23 11 17  3 10 (33) |[#38/7601p w/p 30] My Hero Academia
  370  %known_w 91.1 %known_k 98.6 | R 7.9 / 495 | JLPT 26 10 16  3 11 (32) |[#16/2825p w/p 17] Knights of Sidonia
  371  %known_w 89.0 %known_k 98.4 | R -1.0 /   0 | JLPT 25 10 20  2 11 (29) |[# 2/ 388p w/p 23] Mein Kampf
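The three JLPT orderings above can be approximated with plain sort keys. This is a sketch under assumptions: the dictionary layout and function names are illustrative, the five percentages are taken to be Levels 5, 4, 3, 2, 1 in that order, and the intermediate key is assumed to be the sum of the Level 3 and Level 2 percentages. As in the listings, the best match sorts last. The sample values are copied from the listings above.

```python
# Percentages copied from the listings; jlpt_pct is ordered Level 5 .. Level 1.
titles = [
    {"title": "Death Note", "jlpt_pct": [32, 12, 18, 2, 8], "non_jlpt_pct": 26},
    {"title": "Liar Game", "jlpt_pct": [27, 11, 19, 2, 9], "non_jlpt_pct": 29},
    {"title": "Knights of Sidonia", "jlpt_pct": [26, 10, 16, 3, 11], "non_jlpt_pct": 32},
    {"title": "I want to eat your pancreas", "jlpt_pct": [35, 21, 13, 2, 5], "non_jlpt_pct": 20},
]


def sort_for_max_jlpt(items):
    """Fewest non-JLPT words last (most JLPT content overall)."""
    return sorted(items, key=lambda t: t["non_jlpt_pct"], reverse=True)


def sort_for_intermediate(items):
    """Highest combined Level 3 + Level 2 share last (assumed key)."""
    return sorted(items, key=lambda t: t["jlpt_pct"][2] + t["jlpt_pct"][3])


def sort_for_advanced(items):
    """Highest Level 1 share last."""
    return sorted(items, key=lambda t: t["jlpt_pct"][4])


if __name__ == "__main__":
    for label, fn in [
        ("max JLPT", sort_for_max_jlpt),
        ("intermediate", sort_for_intermediate),
        ("advanced", sort_for_advanced),
    ]:
        print(label, "->", fn(titles)[-1]["title"])
```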

Another method is the 'highest bang-for-the-buck' method, which takes into account the current known words, the number of new words likely to be learned, and the effort to read any given title (total words in the whole series and its difficulty):

...
* 359  %known_w 86.0 %known_k 98.3 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family  (994 point impr)
  360  %known_w 90.6 %known_k 98.7 | R 7.1 /  85 | JLPT 33 16 12  3  6 (27) |[# 1/ 124p w/p 62 %k66] K-On! College  (78 point impr)
  361  %known_w 88.6 %known_k 98.8 | R 8.8 /2095 | JLPT 33 14  8  2  6 (34) |[#105/3336p w/p 15 %k33] Yotsuba&!  (1180 point impr)
  362  %known_w 92.5 %known_k 99.3 | R 8.0 / 254 | JLPT 29 14 13  2  9 (31) |[#13/1934p w/p 54 %k73] Working!!  (1002 point impr)
  363  %known_w 88.5 %known_k 98.9 | R 8.3 / 411 | JLPT 30 12 12  2  6 (35) |[#11/1465p w/p 52 %k62] Honey and Clover  (1018 point impr)
  364  %known_w 90.5 %known_k 98.6 | R 7.6 / 120 | JLPT 34 15 12  2  5 (29) |[#11/1363p w/p 47 %k67] A-Channel  (774 point impr)
  365  %known_w 89.9 %known_k 99.0 | R 8.7 / 335 | JLPT 34 13 13  2  6 (28) |[# 5/ 778p w/p 46 %k77] Kakukaku Shikajika  (452 point impr)
  366  %known_w 90.5 %known_k 98.8 | R 7.5 / 104 | JLPT 33 14 13  2  8 (28) |[# 6/ 736p w/p 55 %k72] Hakoiri Drops  (464 point impr)
  367  %known_w 88.8 %known_k 98.3 | R 8.3 / 129 | JLPT 30 15 13  2  7 (31) |[# 4/ 381p w/p 43 %k73] Will you marry me again if you are reborn?  (236 point impr)
* 368  %known_w 91.4 %known_k 98.5 | R 7.8 /  95 | JLPT 30 15 13  2  7 (30) |[# 8/ 746p w/p 36 %k70] A Man & His Cat  (352 point impr)
  369  %known_w 91.5 %known_k 99.1 | R 7.9 /  70 | JLPT 23 14 12  2  5 (41) |[#14/2666p w/p 47 %k42] Barefoot Gen  (1572 point impr)
  370  %known_w 87.5 %known_k 98.8 | R 8.1 / 251 | JLPT 24 13 15  2  6 (38) |[#10/1892p w/p 31 %k46] Rose of Versailles  (1247 point impr)
  371  %known_w 87.7 %known_k 99.6 | R 8.7 / 399 | JLPT 30 15 11  3  5 (34) |[#11/2044p w/p 33 %k33] Doraemon  (1707 point impr)

Here it suggests that I read Doraemon, which would otherwise be fine, but it has a horrible kanji-to-word ratio (%k33, where the average is between 60 and 70%), so it's mostly hiragana.

'Spy x Family' is popular and would be useful for the JLPT, so I've tagged it for further reading, but with 86% comprehension it would be a bit too difficult right now. What should I read before that? The system reads each manga for you and reports the improvement results:

* 360  %known_w 86.4 %known_k 98.5 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [0.5 impr after 8/746 volumes/pages of A Man & His Cat with compr 91.4% (R 7.82)]
  361  %known_w 88.0 %known_k 98.8 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [2.0 impr after 16/3106 volumes/pages of Dengeki Daisy with compr 91.5% (R 8.67)]
  362  %known_w 87.4 %known_k 98.6 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [1.4 impr after 13/1934 volumes/pages of Working!! with compr 92.5% (R 8.0)]
  363  %known_w 87.1 %known_k 98.7 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [1.1 impr after 11/1363 volumes/pages of A-Channel with compr 90.5% (R 7.58)]
  364  %known_w 86.6 %known_k 98.5 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [0.6 impr after 5/908 volumes/pages of Paradise Kiss with compr 90.5% (R 8.44)]
  365  %known_w 86.2 %known_k 98.4 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [0.3 impr after 2/390 volumes/pages of Beast Master with compr 90.2% (R 8.48)]
  366  %known_w 86.7 %known_k 98.5 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [0.7 impr after 6/736 volumes/pages of Hakoiri Drops with compr 90.5% (R 7.47)]
  367  %known_w 86.8 %known_k 98.6 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [0.8 impr after 5/625 volumes/pages of Bocchi the Rock! with compr 88.9% (R 7.44)]
  368  %known_w 86.1 %known_k 98.4 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [0.1 impr after 1/124 volumes/pages of K-On! College with compr 90.6% (R 7.12)]
  369  %known_w 86.4 %known_k 98.4 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [0.4 impr after 4/381 volumes/pages of Will you marry me again if you are reborn? with compr 88.8% (R 8.33)]
  370  %known_w 86.1 %known_k 98.4 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [0.1 impr after 1/124 volumes/pages of K-On! Highschool with compr 90.6% (R 7.03)]
* 371  %known_w 91.0 %known_k 99.1 | R 8.6 / 789 | JLPT 25 12 14  2  8 (35) |[#10/1658p w/p 33 %k64] Spy x Family [5.0 impr after 10/1658 volumes/pages of Spy x Family with compr 86.0% (R 8.61)]

Reading 13 volumes of 'Working!!' or 11 volumes of 'A-Channel' would improve comprehension by 1.4 and 1.1 percentage points respectively. Or maybe just tackle the problem head-on and start reading Spy x Family, which would result in a comprehension of 91%. This means that the manga contains a fair amount of repetition, but naturally you don't retain ALL the unknown words on the first reading.
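The "read it for you" step can be pictured as follows. This is only a sketch of the idea, not the actual script: it assumes comprehension is the share of word occurrences that are already known, and it models imperfect retention with a hypothetical `min_occurrences` threshold (a new word counts as learned only if it appears at least that many times in the title being read). The word lists are made-up toy data.

```python
from collections import Counter


def comprehension(word_counts, known):
    """Percentage of word occurrences in a title that are in the known set."""
    total = sum(word_counts.values())
    if total == 0:
        return 0.0
    known_total = sum(c for w, c in word_counts.items() if w in known)
    return 100.0 * known_total / total


def simulate_reading(word_counts, known, min_occurrences=1):
    """Return the known set after reading a title.

    Hypothetical retention model: an unknown word is learned only if it
    occurs at least min_occurrences times in the title.
    """
    learned = {
        w for w, c in word_counts.items()
        if w not in known and c >= min_occurrences
    }
    return known | learned


def improvement(target, stepping_stone, known, min_occurrences=1):
    """Percentage-point gain on target after first reading stepping_stone."""
    before = comprehension(target, known)
    after = comprehension(
        target, simulate_reading(stepping_stone, known, min_occurrences)
    )
    return after - before


if __name__ == "__main__":
    # Toy data, not taken from any real title.
    target = Counter({"スパイ": 4, "家族": 3, "任務": 2, "学校": 1})
    stone = Counter({"任務": 3, "スパイ": 1, "猫": 5})
    known = {"家族", "学校"}
    print(comprehension(target, known))
    print(improvement(target, stone, known, min_occurrences=2))
```

Passing the target itself as the stepping stone gives the "tackle it head-on" case, where the gain comes purely from repetition inside the series.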

B-M-dev commented 10 months ago

Here you go: https://github.com/B-M-dev/Bilingual_Manga-home--src- (the code is a mess and has no comments in it). We use MongoDB instead of JSON files for the actual site.

Feel free to contribute and maintain the code.

mjuhanne commented 10 months ago

Thanks! I'll take a look

B-M-dev commented 10 months ago

Just saw your fork. I suggest you put the rating data inside admin.manga_meta.json (inside the manga_titles array) instead of a separate ratings.json file, as it is not that big and it would look more organized. Also, can you please share the Python scripts and dictionaries you are using for this project?

mjuhanne commented 10 months ago

Thanks for the suggestion. I did some refactoring, but I'd still like to use a separate ratings.json because, unlike the static admin.* files, the ratings data is dynamic. Currently it's updated by a separate Python script (in my repo now), but in the future it may be updated automatically by code in Svelte, so the file would act as a cache to avoid unnecessary traffic to mangaupdates.com.
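To illustrate the cache idea, here is a minimal sketch. It is not the script in the repo: the function name `get_rating`, the entry layout, the `max_age_s` parameter and the title key are all hypothetical, and the actual lookup against mangaupdates.com is deliberately left to a caller-supplied `fetch` function rather than guessed at.

```python
import json
import time
from pathlib import Path


def get_rating(title_id, fetch, cache_path="ratings.json",
               max_age_s=7 * 24 * 3600, now=None):
    """Return a rating, calling fetch(title_id) only when the cache is stale.

    fetch is whatever function actually queries the ratings source;
    it is passed in so this sketch makes no network calls itself.
    """
    now = time.time() if now is None else now
    path = Path(cache_path)
    cache = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}

    entry = cache.get(title_id)
    if entry is not None and now - entry["fetched_at"] < max_age_s:
        return entry["rating"]  # fresh enough, no traffic needed

    rating = fetch(title_id)
    cache[title_id] = {"rating": rating, "fetched_at": now}
    path.write_text(
        json.dumps(cache, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return rating
```

Keeping the fetch timestamp next to each rating is what makes this file dynamic, which is the reason for not merging it into the static admin.* files.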

I'll make a PR, so let's continue the discussion over there.