Pietracoops / yugioh_cardlist_scraper

Yugioh Card Database Generator offline CSV: A simple Python script that scrapes the KONAMI website to build a complete list of all yugioh cards (and their respective card information) as CSV files. It can serve as a useful tool for developers interested in the yugioh domain.

Languages #1

Open blal1 opened 1 year ago

blal1 commented 1 year ago

Hi. When I added the parameter request_locale=fr to download the French cards, I got only the pack names in French; the cards were still in English. How can I get them in French? Thanks

Pietracoops commented 1 year ago

Hey!

First off, thanks so much for the support, I appreciate it :)

Secondly, this is a great question, and you are on the right track. I saw your email last week, but I have been tied up with other things and did not get a chance to respond.

I have looked into how this can be done and made some modifications. They are not pushed yet, but I'll push them to a second, experimental branch tonight. To give you some context, if you have taken a look at the code: I am scraping two websites for all the card information. The modifications gather the card information in French, but not the additional information from the secondary website, because that site does not seem to have it in French. So the fix I can send you covers only the card information, and I haven't tested it extensively, so there may be some bugs.

I'll send you more information later on tonight when I push the code!


Pietracoops commented 1 year ago

A new branch called experimental-language has been created. Check out this branch and call the script as instructed in the readme.md file, using the parameter "--language fr". I'm still running tests and working out some of the code, but give it a try, and if you run into any issues, list them here and I'll tackle them.

blal1 commented 1 year ago

Hi, thanks for your work. The scraper now works correctly and I can download the French cards without any problems. For the wiki, the URL is different from the English one:

https://yugioh.fandom.com/fr/wiki/Wiki_Yu-Gi-Oh! That is the French Ygo wiki. Sorry about my English.

Pietracoops commented 1 year ago

No problem, I also speak French, so you can send me emails in French if that makes it easier.

I pushed some more changes this morning, as I noticed the script was crashing on card sets that are not fully released yet (Maze of Memories, March 10th release). If you run into that issue, pull from the branch.

Also, I've been playing around with the languages on fandom and noticed that they don't have multilingual card pages (e.g. https://yugioh.fandom.com/wiki/Flame_Manipulator -> https://yugioh.fandom.com/fr/wiki/Flame_Manipulator does not work). So there is still work to be done on this, and the wiki information will only be available in English for now.
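To make the example above concrete, here is a small sketch of the naive URL rewrite: inserting the language code in front of the path. `localized_wiki_url` is a hypothetical helper, not part of the scraper, and as noted above the resulting French URL does not resolve to a card page, so this only illustrates the pattern that was tried.

```python
from urllib.parse import urlsplit, urlunsplit


def localized_wiki_url(url: str, language: str) -> str:
    """Insert a language prefix before the path, e.g. /wiki/X -> /fr/wiki/X.

    Hypothetical helper for illustration only. English URLs are returned
    unchanged because the English wiki has no language prefix.
    """
    if language == "en":
        return url
    parts = urlsplit(url)
    new_path = f"/{language}{parts.path}"
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    english = "https://yugioh.fandom.com/wiki/Flame_Manipulator"
    print(localized_wiki_url(english, "fr"))
```

Building the URL is the easy part; whether a page exists at that address still has to be checked for each card.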

blal1 commented 1 year ago

Hello, I tested the program and encountered a bug. While downloading the French cards, it fails on the 144th pack and prints the following errors before exiting:

Packs Processed |█████████▏⚠︎ | (!) 144/632 [23%] in 2:32.5 (0.95/s)
Traceback (most recent call last):
  File "C:\yugioh_cardlist_scraper-experimental-language\yugioh_cardlist_scraper-experimental-language\main.py", line 94, in <module>
    if detect(english_name) != 'en':
       ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Eleve\AppData\Local\Programs\Python\Python311\Lib\site-packages\langdetect\detector_factory.py", line 130, in detect
    return detector.detect()
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\Eleve\AppData\Local\Programs\Python\Python311\Lib\site-packages\langdetect\detector.py", line 136, in detect
    probabilities = self.get_probabilities()
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Eleve\AppData\Local\Programs\Python\Python311\Lib\site-packages\langdetect\detector.py", line 143, in get_probabilities
    self._detect_block()
  File "C:\Users\Eleve\AppData\Local\Programs\Python\Python311\Lib\site-packages\langdetect\detector.py", line 150, in _detect_block
    raise LangDetectException(ErrorCode.CantDetectError, 'No features in text.')
langdetect.lang_detect_exception.LangDetectException: No features in text.

C:\Users\Eleve>Python C:\yugioh_cardlist_scraper-experimental-language\yugioh_cardlist_scraper-experimental-language\main.py --language fr

blal1 commented 1 year ago

Hello! Sorry for the inconvenience. I tested the program from the experimental-language branch, but when I try to download the French packs I get the following errors. It happens when the program tries to download the 144th pack; after that, it does not continue with the next packs.

Packs Processed |█████████▏⚠︎ | (!) 144/634 [23%] in 2:25.8 (0.99/s)
Traceback (most recent call last):
  File "C:\ygo\ygo\main.py", line 94, in <module>
    if detect(english_name) != 'en':
       ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Bil\AppData\Local\Programs\Python\Python311\Lib\site-packages\langdetect\detector_factory.py", line 130, in detect
    return detector.detect()
           ^^^^^^^^^^^^^^^^^
  File "C:\Users\Bil\AppData\Local\Programs\Python\Python311\Lib\site-packages\langdetect\detector.py", line 136, in detect
    probabilities = self.get_probabilities()
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Bil\AppData\Local\Programs\Python\Python311\Lib\site-packages\langdetect\detector.py", line 143, in get_probabilities
    self._detect_block()
  File "C:\Users\Eleve\AppData\Local\Programs\Python\Python311\Lib\site-packages\langdetect\detector.py", line 150, in _detect_block
    raise LangDetectException(ErrorCode.CantDetectError, 'No features in text.')
langdetect.lang_detect_exception.LangDetectException: No features in text.

Thanks,

Pietracoops commented 1 year ago

Try pulling the latest changes; that should resolve the issue! It was caused by a feature I added to catch certain Japanese edge cases. When the script encounters a card whose name is simply "7", the language detector cannot find enough features to decide what language it is.
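For anyone hitting the same exception on an older checkout, here is a minimal sketch of one way to guard the call. It is not the fix that was pushed to the branch. `safe_detect` and `has_letters` are hypothetical helpers, and the sketch assumes that names without any letters (such as "7") are what leave the detector with no features. The detector is passed in as a parameter so the example has no third-party dependency; in the scraper it would be langdetect's `detect`.

```python
from typing import Callable, Optional


def has_letters(text: str) -> bool:
    """Return True if the text contains at least one alphabetic character."""
    return any(ch.isalpha() for ch in text)


def safe_detect(
    text: str,
    detector: Callable[[str], str],
    default: Optional[str] = None,
) -> Optional[str]:
    """Run `detector` only when the text has letters; otherwise return `default`.

    Assumption: input made only of digits, punctuation, or whitespace is what
    gives the detector nothing to work with, so it is short-circuited here
    instead of being passed on.
    """
    if not has_letters(text):
        return default
    return detector(text)


if __name__ == "__main__":
    # Stand-in detector for demonstration; swap in langdetect's detect().
    def always_french(_: str) -> str:
        return "fr"

    print(safe_detect("7", always_french, default="en"))
    print(safe_detect("Manipulateur de Flammes", always_french, default="en"))
```

With this, the check from the traceback could be written as safe_detect(english_name, detect, default='en') != 'en', so a name like "7" is treated as English instead of raising.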

enamespace commented 10 months ago

Hi, I want to download the Chinese versions of the cards, but that does not seem to be supported for now.