SalvationDevelopment / YGOSalvation-Server

Server, Launcher, build test, and management system for YGOPro and related technologies.
http://ygosalvation.com
Other
23 stars 18 forks source link

card DB connector #53

Closed Zayelion closed 9 years ago

Zayelion commented 10 years ago

Create an JS function that connects to server/http/ygopro/card.db within the file structure from server/libs/carddb-update.js. (create the new file) Have the file print to console.log the id of each field in the table one at a time in an asynchronous manner.

Zayelion commented 10 years ago

The issues addressed here are about asynchronous. We want the server to process ONE page at a time fully and save it out before moving to the next one. If it prints all of them at once the server will open 7000+ connections and then crash/stop the server while it is processing that.

So this will require a code review to make sure it is properly asynchronous.

adrianojn commented 9 years ago

any update on this?

Zayelion commented 9 years ago

https://github.com/SalvationDevelopment/YGOPro-Support-System/blob/master/server/libs/carddb-update.js @me-stefan-baiu wrote most of this but due to school and stuff he stopped. There are a few parsing issues, if I remember correctly. I'm hoping he finishes it by the end of the year, that would be great. He sorta dropped off the earth. We are still doing it by hand atm.

adrianojn commented 9 years ago

Wikia JSON dump: result.zip

Zayelion commented 9 years ago

Well, thats the first step next is marshaling it into the cards.cdb for each language. That's the part @me-stefan-baiu was working on, understanding how YGOPro deals with each piece of data, tokens, alternate card arts, and so forth. Then unwinding the JSON data correcting inconsistencies and then updating the card file once he had verified that the logic was correct. This needs to be done over all the languages and into separate files.

adrianojn commented 9 years ago

translatedb

Known limitations:

Zayelion commented 9 years ago

I "think" this is good enough for the moment, but feel free to keep tweaking it to over come the alternative artworks limitation. An output of the name only translations to pass off to the "YGOrganization" or the general public would be helpful to fill in the missing data via interested parties; I'm sure it's extensive.

Cards without official ID's not being translated is fine. This keeps the card release for Salvation under DevPro but above or with DuelingNetwork which I am OK with.

I'll help out more with it in the future because I have to learn Go for work. There is still some manual work for updates that has to be done but I think it's fine.

adrianojn commented 9 years ago

An output of the name only translations to pass off to the "YGOrganization" or the general public would be helpful to fill in the missing data via interested parties

The command prints the ID and name of the cards, you can filter the output with:

translatedb -db cards-es.cdb -lang es | grep incomplete > log.txt

I think that GitHub shell includes the grep command.

Zayelion commented 9 years ago

After testing and reading over https://github.com/adrianojn/translatedb/blob/master/translate.go dbUpdate(id, name, lore string) needs some additional(or less) logic to update partials. For figuring out alts do id--; except on tokens, I forget the logic used for those at the moment. Asian languages do not parse do to the ruby markers but thats ok at the moment.

adrianojn commented 9 years ago

Asian languages do not parse do to the ruby markers

Done

For figuring out alts do id--;

Is not enough, some cards have more than 2 artworks. Should I consider consecutive IDs as same artwork? Which should be the limit?

This query works:

UPDATE texts SET name=?, desc=? WHERE id>=? AND id <(? + 9);

But 9 may be too much and overlap with a real card.

Zayelion commented 9 years ago

If it fails look up by name?

adrianojn commented 9 years ago

Binaries at http://gobuild.io/github.com/adrianojn/translatedb (v1.3)

Compatible with 10 artworks and never overlap with real cards.

Zayelion commented 9 years ago

That is the most glorious website I have ever seen. Right after github.

adrianojn commented 9 years ago

what is still missing?

Zayelion commented 9 years ago

It skipped cards like Giant Solider of Stone during testing for es. Looking over the code when it gets incomplete data it just dumps it. Also in the code that generates the wikia.json is locked to the en wikia. The es wikia has far more data, and I think they are other wikias also.

adrianojn commented 9 years ago

Giant Soldier of Stone doesn't have a es translation (lore). Should I change to update partials?

The others wikias are very poorly. They don't handle redirects and uses custom tags like: descripción (es_lore on English wikia), Beschreibung (de_lore).

Zayelion commented 9 years ago

Yes you should update partials. You could create a configuration file and split the work out for the custom tag mapping via a JSON file. We want to pull as much information down as possible.

adrianojn commented 9 years ago

The biggest problem is the redirect system. The correct way to get data from wikia is by card title, not by card id. The english wikia redirects the card id to the correct page but the others don't.

I will remove the "updating" log message, the program will log only the incomplete cards

Zayelion commented 9 years ago

What about running a different query that dumps legal cards for OCG and TCG into a JSON file and then loading that JSON file into a into an in memory NoSQL DB and doing the processing with slightly different ID based queries?

adrianojn commented 9 years ago

Sometimes I think you don't evaluate the system complexity.

running a different query that dumps legal cards for OCG and TCG

This requires a better knowledge of wikia api.

doing the processing with slightly different ID based queries

This is about the tranlatedb or wikia command?

Zayelion commented 9 years ago

Sometimes I think you don't evaluate the system complexity.

Yeah this is likely true, but it looks like an unmaintained mess than intentional complexity. I keep making alot of assumptions about the wikia's functionality, looking at it again I see your point.

I was thinking if you could get a JSON download from an alternative wikia with the wikia command (using an unspecific method) you could use NoSQL to map it into the card DB.

Zayelion commented 9 years ago

The English one works fine btw. And sorry, its not my intention to be critical or frustrate you. You are a huge help!

adrianojn commented 9 years ago

To download from other wikia only this line needs to be changed to something like http://es.yugioh.wikia.com/api.php, but still is wikia responsibility to do the id -> name mapping.

To parse the language-specific JSON only this line needs to be changed (use en as fake language).

The difficulty is really the wikia api, being title-based instead of id-based was a poor design decision. I'll try to get the pages by OCG&TCG lists but it looks very complicated.

adrianojn commented 9 years ago

I found a way to list all TCG an OCG pages, cmcontinue is used to paginate.

For now I identified this work flow:

  1. download the list of pages by category (pages-es.json)
  2. download the pages (cards-es.json)
  3. save on database (cards-es.cdb)

I think that the wikia and translatedb will merge.

Should the specific wikias be the only source? If no, which should have preference?

Zayelion commented 9 years ago

Yes, that keeps things simple I feel.

adrianojn commented 9 years ago

I noticed one bug on traslatedb: pendulum text is ignored. Which format should be used to join card text and pendulum text?

Zayelion commented 9 years ago

'[Pendulum Effect: ]' + pendulum_text + '\r\n[Monster Effect: ]' + Lore

adrianojn commented 9 years ago

Done (v1.2)

Only es, fr and de wikias are supported. Italian wiki is bad formatted and the others are plain text.

Zayelion commented 9 years ago

https://github.com/me-stefan-baiu/YGOPro-Support-System/commit/46f58013f2dd5e3831309ff6c1d5eaca0a29cb4d

Stefan is working on something again.