DawnbrandBots / yaml-yugi

A machine-readable, human-editable database of the Yu-Gi-Oh! Trading Card Game, Official Card Game, Master Duel, Rush Duel, Speed Duel.
https://dawnbrandbots.github.io/yaml-yugi/cards.json
GNU Affero General Public License v3.0

Korean input priority for card names and text #26

Open kevinlul opened 1 year ago

kevinlul commented 1 year ago
  1. https://github.com/DawnbrandBots/yaml-yugi-ko/blob/master/overrides.tsv (ultimately, this should not be needed)
  2. Yugipedia, if it contains ruby text
  3. Official database contents otherwise preferred

Highlight discrepancies between Yugipedia and the official database and correct Yugipedia
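
The priority order above could be sketched roughly as follows. This is only an illustration, not the repository's actual code: `pick_korean_text` and `has_ruby` are hypothetical names, detecting ruby text via a `<ruby>` tag is an assumption about how the Yugipedia text is stored, and falling back to Yugipedia when the official database has no entry is also an assumption.

```python
from typing import Optional, Tuple


def has_ruby(text: str) -> bool:
    # Assumption: ruby annotations show up as HTML <ruby> markup.
    return "<ruby>" in text


def pick_korean_text(
    override: Optional[str],
    yugipedia: Optional[str],
    official: Optional[str],
) -> Optional[Tuple[str, str]]:
    """Return (source, text) following the priority list, or None if nothing is available."""
    if override:
        return "override", override
    if yugipedia and has_ruby(yugipedia):
        return "yugipedia", yugipedia
    if official:
        return "official", official
    if yugipedia:
        # Assumed fallback: no official entry, so use Yugipedia even without ruby.
        return "yugipedia", yugipedia
    return None
```

Returning the source alongside the text makes it straightforward to log which inputs were actually used and to spot overrides that never win.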

kevinlul commented 1 year ago

Need to wrap in LiteralScalarString

kevinlul commented 1 year ago

https://github.com/DawnbrandBots/yaml-yugi/actions/runs/4218732281/jobs/7323445739 download logs and take action on override items as appropriate

kevinlul commented 1 year ago

984 overrides are unnecessary: unnecessary.log (cc: @Ice-Pendragon)

kevinlul commented 1 year ago

680 Yugipedia-official name discrepancies. Many are differences in spacing or dash used, but some are typographic: discrepancy.log
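
To separate the spacing and dash differences from the rest, one option is to normalize both names before comparing. A minimal sketch, assuming the comparison is done on plain strings; `normalize_name` and `classify` are hypothetical helpers, and the set of dash characters folded here is a guess rather than a list taken from discrepancy.log.

```python
import re
import unicodedata

# Dash-like characters folded to an ASCII hyphen (assumed set, extend as needed).
_DASHES = str.maketrans(
    {c: "-" for c in "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"}
)


def normalize_name(name: str) -> str:
    """Fold width variants and dashes, then drop all whitespace."""
    folded = unicodedata.normalize("NFKC", name).translate(_DASHES)
    return re.sub(r"\s+", "", folded)


def classify(yugipedia: str, official: str) -> str:
    """Label a pair of names as 'same', 'spacing-or-dash', or 'other'."""
    if yugipedia == official:
        return "same"
    if normalize_name(yugipedia) == normalize_name(official):
        return "spacing-or-dash"
    return "other"
```

Only the pairs labelled `other` would then need a manual look.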

kevinlul commented 1 year ago

4616 card text discrepancies, though these could be due to all sorts of factors like whitespace: text.log

99 for Pendulum text, ditto: pendulum.log

Ice-Pendragon commented 1 year ago

> Need to wrap in LiteralScalarString

Now it looks great. Thanks!

kevinlul commented 1 year ago

We need to keep refining this, though, or improve the quality on Yugipedia, as errata don't get entered in the Korean database :/ e.g. https://yugipedia.com/wiki/Card_Errata:Evolutionary_Bridge

kevinlul commented 1 year ago

Per 81428725ab16a163728de9251b64ed1c4e6e8d83 and https://github.com/DawnbrandBots/yaml-yugi/actions/runs/5568150780/jobs/10170569310#step:6:7, we can add an automated workflow to check for missing Korean (and Japanese!) translations.

kevinlul commented 1 year ago

Considering https://docs.github.com/en/actions/managing-issues-and-pull-requests/scheduling-issue-creation

Ice-Pendragon commented 1 year ago

> Per 8142872 and https://github.com/DawnbrandBots/yaml-yugi/actions/runs/5568150780/jobs/10170569310#step:6:7, we can add an automated workflow to check for missing Korean (and Japanese!) translations.

That would be great and helpful. Though, I still don't understand what scheduling issue creation is.

kevinlul commented 1 year ago

I created a workflow that identifies any prerelease cards that are missing placeholder IDs (fake passwords). Similarly, I could create one to identify missing translations. Taking it a step further, instead of just logging the cards that are missing content, the workflows could automatically create an issue for the missing cards and assign it to the appropriate people.
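
As a rough idea of the check itself (not the actual workflow): the sketch below assumes each card is a dict with a `name` mapping keyed by language code such as `en`, `ko`, and `ja`. `find_missing_translations` and `issue_body` are hypothetical names, and actually opening and assigning the issue is left out.

```python
from typing import Dict, Iterable, List, Sequence


def find_missing_translations(
    cards: Iterable[dict], languages: Sequence[str] = ("ko", "ja")
) -> Dict[str, List[str]]:
    """Map each language to the English names of cards lacking a name in it."""
    missing: Dict[str, List[str]] = {lang: [] for lang in languages}
    for card in cards:
        names = card.get("name") or {}
        for lang in languages:
            if not names.get(lang):
                missing[lang].append(names.get("en") or "(unnamed)")
    return missing


def issue_body(missing: Dict[str, List[str]]) -> str:
    """Render the result as plain text suitable for an issue description."""
    lines: List[str] = []
    for lang, names in missing.items():
        if names:
            lines.append(f"Missing {lang} names ({len(names)}):")
            lines.extend(f"- {name}" for name in names)
    return "\n".join(lines)
```

A scheduled run could skip issue creation whenever `issue_body` returns an empty string.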

kevinlul commented 1 year ago

Will be using this issue to track work on the proposal: https://github.com/DawnbrandBots/yaml-yugi-ko#proposal

7ab822a4ab58a3be333fd05fe747858ea94dd37d 97441fe06e908809258e7c451f748aadcd23a4fd

kevinlul commented 1 year ago

TODO (me):

kevinlul commented 1 year ago

@Ice-Pendragon the old overrides.tsv will need to be replaced with the more comprehensive ocg-override.csv based on the work done in the Google Sheets.

Ice-Pendragon commented 1 year ago

> @Ice-Pendragon the old overrides.tsv will need to be replaced with the more comprehensive ocg-override.csv based on the work done in the Google Sheets.

Got it. Ruby, Omitted Errata, and what else do you need?

kevinlul commented 1 year ago

You can look at the README of that repo to see how I've designed the files to be used, and if there are any problems with this approach.

kevinlul commented 1 year ago

You should be able to use the two Rush CSVs now to provide Korean translations for Bastion.

kevinlul commented 1 year ago

> update yaml-yugi-ko to use alternate method to always scrape the entire official database

This is complete but needs some cleanup before I commit it this evening. After this, scrapes should be much faster and always update all card text.

kevinlul commented 1 year ago

@Ice-Pendragon I see you just did DawnbrandBots/yaml-yugi-ko@3cf7050e14036930367c1a56ed72670ab859a5f9, but actually, like half an hour before that, I just implemented the full scraper as promised, and it worked: DawnbrandBots/yaml-yugi-ko@581f0620cb5aff20c26db224d1b2a1ab036c5a0e

So actually, none of the YAML data files in that repository are needed now, which kind of makes it csv-yugi-ko rather than yaml-yugi-ko!

Ice-Pendragon commented 1 year ago

> @Ice-Pendragon I see you just did DawnbrandBots/yaml-yugi-ko@3cf7050, but actually, like half an hour before that, I just implemented the full scraper as promised, and it worked: DawnbrandBots/yaml-yugi-ko@581f062
>
> So actually, none of the YAML data files in that repository are needed now ~which kind of makes it csv-yugi-ko rather than yaml-yugi-ko~!

Then should the YAML files be deleted? (After a backup, if you need them.)

kevinlul commented 1 year ago

I can remove them next week after we verify that the new code works well when a new pack is released. There's no need to explicitly back them up since they'll remain in the Git history of that repository.

kevinlul commented 1 year ago

I missed that the run there was actually triggered by you, and the automatic scheduled run only just happened. 😅

kevinlul commented 1 year ago

https://github.com/DawnbrandBots/yaml-yugi-ko/blob/master/ocg-override.csv is live DawnbrandBots/yaml-yugi-ko@705b9e1a08decaec7314ea75db23d3d772caf9cc ➡️ 410b5a3fd69060c4da7135e47da192af9840fead