mtgjson / mtgjson3

MTGJSON repository for Magic Cards
http://mtgjson.com

Automated updates? #306

Closed ekendra-nz closed 6 years ago

ekendra-nz commented 7 years ago

I'm just wondering if the whole system could be automated so that mtgjson.com always has the newest set info as soon as the data is available on gatherer.wizards.com?

I'm not familiar enough with the code base to see what it would take to make it happen. I'm also short on time to dedicate to it.

I'm just wondering how feasible it would be.

eddie-g commented 7 years ago

I don't know why it would be difficult at all. It's been almost 3 weeks and there is still no MM17 set. I would be willing to help; as it stands now, mtgjson updates too slowly to be relied upon.

Garbee commented 7 years ago

The system currently is a static scraper that prebuilds static files to serve. Complete automation could lead to complete failure, meaning no new output at all, if Wizards changes something in the pages we scrape, adds a new mechanic, or does some other absurdity.

This is something I've been looking at, but it involves an absolute rewrite so that MTGJson is no longer a static file generator but instead stores data in a database, from which we can then cache the JSON output for use. That will allow for a far more lenient pulling system where, if things fail, the runs will still complete fine and we can fix the individual issues later. Further, this would give us an absolutely unique internal identifier to reference cards by, instead of the current system, which depends on data that is capable of being changed.

The current codebase is overly complex and is written fairly poorly imho. This is because it was not built with these kinds of requirements in mind. It works, but only just. So, in doing a rewrite, I'm literally starting from scratch on a new project. Right now it isn't public since I only have some bare-bones bits working and it's been a few months since I've touched it. Once I get it to where full sets can be scraped with all types of cards (except for quirks like flip/split cards), I'll open the repo up for some feedback on it.
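
To make the idea concrete, here is a minimal sketch of a database-backed, failure-tolerant import, assuming SQLite (3.24 or newer, for the upsert syntax) and Python. Nothing here comes from the actual rewrite: the table layout, `parse_card`, and `ingest` are all hypothetical, and treating name plus set code as unique is a simplification that real cards (for example, several printings of one card in a set) would break.

```python
import json
import sqlite3


def parse_card(raw):
    """Hypothetical parser: raises ValueError on records it does not understand."""
    if "name" not in raw or "set" not in raw:
        raise ValueError(f"unparseable card record: {raw!r}")
    return raw["name"], raw["set"], json.dumps(raw, sort_keys=True)


def ingest(conn, raw_cards):
    """Store every card that parses; log the rest so the run still completes."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cards ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # internal id, independent of card data
        " name TEXT NOT NULL, set_code TEXT NOT NULL, data TEXT NOT NULL,"
        " UNIQUE (name, set_code))"  # simplification, see the note above
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS failures (raw TEXT NOT NULL, error TEXT NOT NULL)"
    )
    stored = failed = 0
    for raw in raw_cards:
        try:
            name, set_code, data = parse_card(raw)
        except ValueError as exc:
            # Record the problem for later fixing instead of aborting the run.
            conn.execute(
                "INSERT INTO failures VALUES (?, ?)", (json.dumps(raw), str(exc))
            )
            failed += 1
            continue
        # Upsert: a re-scrape refreshes the data but keeps the same internal id.
        conn.execute(
            "INSERT INTO cards (name, set_code, data) VALUES (?, ?, ?)"
            " ON CONFLICT (name, set_code) DO UPDATE SET data = excluded.data",
            (name, set_code, data),
        )
        stored += 1
    conn.commit()
    return stored, failed
```

Calling `ingest(sqlite3.connect(":memory:"), records)` returns a `(stored, failed)` pair; a bad record ends up in `failures` rather than stopping the import, and the `id` column gives each card a reference that does not depend on scraped fields.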

ekendra-nz commented 7 years ago

@Garbee: Hey! It seems you and I are thinking along the same lines. A database app seems like the way to go. It would be better to just keep the rich dataset and scrape for new cards as they are released.

As far as anomalies (flip cards, multiple prints w/ same multiverseid) are concerned, that's where the community could pitch in and help.

PAK90 commented 7 years ago

A while back a few of us created a fork of mtgjson called mtgsqlive, which was meant to be a live database that you could generate allsets-x.json from (it would also enable spoilers to be added as soon as they were released). I feel like this would be useful for any automated attempts.
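
As a rough illustration of generating an all-sets JSON file from a live database, the sketch below uses SQLite and Python. It is not mtgsqlive's code: the `cards` table, its columns, and both function names are assumptions made up for this example, and the output shape is only loosely modeled on a JSON object keyed by set code.

```python
import json
import sqlite3


def make_demo_db():
    """Build a tiny in-memory database with a hypothetical cards table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cards (set_code TEXT, name TEXT, data TEXT)")
    conn.executemany(
        "INSERT INTO cards VALUES (?, ?, ?)",
        [
            ("MM17", "Example Card B", json.dumps({"rarity": "rare"})),
            ("MM17", "Example Card A", json.dumps({"rarity": "common"})),
            ("XYZ", "Spoiled Card", json.dumps({})),  # e.g. a freshly added spoiler
        ],
    )
    return conn


def export_all_sets(conn):
    """Serialize all cards as {set_code: {"code": ..., "cards": [...]}}."""
    all_sets = {}
    rows = conn.execute(
        "SELECT set_code, name, data FROM cards ORDER BY set_code, name"
    )
    for set_code, name, data in rows:
        entry = all_sets.setdefault(set_code, {"code": set_code, "cards": []})
        card = json.loads(data)  # per-card fields stored as a JSON blob
        card["name"] = name
        entry["cards"].append(card)
    return json.dumps(all_sets, sort_keys=True)
```

Because the export is just a query, a card inserted into the database a minute ago shows up the next time `export_all_sets(conn)` runs, which is what makes early spoilers practical.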

ekendra-nz commented 7 years ago

In a perfect world, all of this data would be collaboratively pruned and made available via an API for all, much like this was attempting to be: http://deckbrew.com/api/

lsmoura commented 7 years ago

MM17 has been live since yesterday, just FYI. Also, what @Garbee said about automation is pretty much all there is to say...

csuhta commented 7 years ago

> In a perfect world, all of this data would be collaboratively pruned and made available via an API

The Scryfall team makes data from MTGJSON (thank you!) and other sources available via our API, and we update during spoiler season. Please check it out if you want a hand-curated source of card information that is downstream from MTGJSON:

https://scryfall.com/docs/api-overview
https://scryfall.com/docs/api-methods
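
For anyone who wants to try it, here is a small Python sketch of looking up a card by name. It assumes the `/cards/named` endpoint with its `exact` and `fuzzy` query parameters; the function names and the `User-Agent` string are invented for this example, so check the API docs for current header and rate-limit requirements before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SCRYFALL_API = "https://api.scryfall.com"


def named_card_url(name, exact=True):
    """Build the lookup URL for a single card by exact or fuzzy name."""
    key = "exact" if exact else "fuzzy"
    return f"{SCRYFALL_API}/cards/named?{urlencode({key: name})}"


def fetch_card(name, exact=True):
    """Fetch one card as a dict. Performs a real HTTP request when called."""
    request = Request(
        named_card_url(name, exact),
        # Placeholder client identification; replace with your own.
        headers={"User-Agent": "example-mtg-client/0.1", "Accept": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        return json.load(response)
```

`named_card_url` is kept separate from `fetch_card` so the URL construction can be checked without touching the network.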

lsmoura commented 7 years ago

@csuhta What do you use for the backend? PHP? Node? Ruby?

csuhta commented 7 years ago

PostgreSQL and Ruby

ekendra-nz commented 7 years ago

@csuhta Mind blown. Happy days. Can't wait until I can get some time to play around with your API.

tooomm commented 7 years ago

Interesting, keep us updated @Garbee! Anyway, I hope you have a bit more flexibility and free time again soon @lsmoura 👍

ZeldaZach commented 6 years ago

V4 will not have the functionality to run automatically, but it should make the system simpler to build.