sselph / scraper

A scraper for EmulationStation written in Go using hashing
MIT License
449 stars, 88 forks

Plans for bulk API support? #253

Open J-Swift opened 5 years ago

J-Swift commented 5 years ago

I just wanted to get some input on whether you had given bulk API support any thought. As mentioned in the gamesdb v2 update PR, I would like to add support to the gamesdb source for bulk API operations. It is essentially required for that source due to new API restrictions that have been put in place. I don't know enough about the other data sources to say whether it's nice-to-have or definitely-should-support.

I've looked through the current code to see how amenable it might be to migrating to bulk operations and have some ideas on paths forward, but I just wanted to touch base with you to see if you had given it thought and/or had plans for it.

I noticed that the project may have fallen off your radar (completely ok and understandable!), but I do appreciate all the work you've put into this up to this point and hope to see the project live on! Happy to assist as I am able.

sselph commented 5 years ago

I have thought about it, but the way the project grew it wasn't going to be super easy. I originally had one source and kind of grew from there, so it operated on a single game at a time, and the fallback to a different source was per game. Eventually several sources added bulk operations, but I never got around to restructuring the code. I remember starting along that path but don't think I ever finished it. Without a lot of thought, I'd think maybe batching the unscraped games and sending them to source #1, then sorting the results into a done list and an unscraped list, then forming batches from the remaining unscraped games and sending them to #2, etc.

If you have ideas and time to implement them feel free to send a pull request and I'll make time to review it.

J-Swift commented 5 years ago

Definitely understand the current code being the result of natural progression. Also, I agree with the batch-per-source idea.

I'm between jobs at the moment, so I've got time to put into this. I'll go ahead and put something together for you to check out. My current hangup in thinking through it is how to rework things so that Rom doesn't control the work. I guess we could introduce something like Romset and embed Rom in that, but I haven't looked far enough into it to say.

Also, as part of this, what do you think of changing the DS interface from

GetGame(context.Context, string) (*Game, error)

to

GetGames(context.Context, []string) [](*Game, error)

(I understand there is no tuple in golang, just imagine that this is a collation of all the results)

The return type is up for discussion, but the key point is that we assume batch operation by default to keep things flexible for potential new data sources. If a source doesn't support batch API calls, it can just request the games sequentially, and the DS consumer is none the wiser.

sselph commented 5 years ago

Yeah, this was also my first Go project, so it is a bit of a mess. That seems reasonable, but at the moment you are probably more familiar with the code than I am.