Open ren- opened 7 years ago
Never really went anywhere. I still think it's something with huge potential, but I don't currently have the time for it.
I'm in the same boat, wanted to start something like this, googled and found your repo.
Heh, nice.
As I see it, there are a few issues to consider and make decisions on:
Where to get the training data from. Scrape poe.trade? Bully some of the other indexers around for their APIs? Or parse the GGG river ourselves (who would provide the resources for that)?
How to acquire training data with as little bias as possible. We're faced with the paradox that only sold items are really relevant for pricing, but we don't know which ones those are. At best, we only learn about items entering and leaving stash tabs.
How to fight price manipulation. Kind of ties into the above. While it's far harder to manipulate "consensus" prices for rares than for uniques, we still need to prevent people repricing their own items 1000 times to sway our learning. Realistically, we won't be able to fight everything though.
I assume that we would build an app (or include our results in one of the pricing apps), but that raises more questions:
And then there's of course the actual "designing and training the network" part. But that's the fun part :P
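One way to blunt the "reprice your own item 1000 times" problem from the list above is to count every item only once, however often it gets relisted. The following is a minimal sketch under assumptions: the `(item_id, seller, price)` tuple shape and both function names are hypothetical, and the per-seller down-weighting is just one possible choice, not something settled in this thread.

```python
from collections import Counter


def latest_price_per_item(observations):
    """Collapse repeated listings of the same item into a single sample.

    `observations` is an iterable of (item_id, seller, price) tuples in the
    order they were seen (a hypothetical shape for the scraper's output).
    """
    latest = {}
    for item_id, seller, price in observations:
        # Later observations overwrite earlier ones, so repricing an item
        # many times still produces exactly one training sample.
        latest[item_id] = (seller, price)
    return latest


def seller_weights(latest):
    """Weight each sample by 1 / (number of items from the same seller).

    This caps the total influence of any single account at 1.0.
    """
    per_seller = Counter(seller for seller, _ in latest.values())
    return {item_id: 1.0 / per_seller[seller]
            for item_id, (seller, _) in latest.items()}
```

This does nothing against manipulation spread over many accounts, and the weighting also penalises legitimate high-volume traders, so it is at best a first line of defence.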
I think we'll have a good approach by going based on affix values (+ maybe some info on base types, links/colors and implicits) - this decoupling from individual items should harden it against manipulation quite a bit. Judging convergence will be harder, but I think we'd first need some data to play with. I was already playing around with trackpete's exiletools API at one point, but him closing the API kind of stopped this project dead in its tracks.
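To make "going based on affix values" concrete, here is a minimal sketch of turning displayed mod lines into a fixed-length numeric vector. It assumes mods arrive as plain display strings such as "+45 to maximum Life"; the function names, the `#` template convention, and averaging multi-number mods are illustrative choices, not a decided design.

```python
import re

# Matches integers and decimals as they appear in displayed mod text.
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def mod_to_feature(mod_line):
    """Split a displayed mod into a value-free template and one number.

    "Adds 5 to 10 Fire Damage" -> ("Adds # to # Fire Damage", 7.5)
    Mods without any number (e.g. "Cannot be Frozen") get the value 1.0.
    """
    values = [float(v) for v in NUMBER.findall(mod_line)]
    template = NUMBER.sub("#", mod_line)
    value = sum(values) / len(values) if values else 1.0
    return template, value


def vectorize(mod_lines, vocabulary):
    """Map an item's mods onto a fixed, ordered list of known templates."""
    features = dict(mod_to_feature(line) for line in mod_lines)
    # Templates the item does not have are encoded as 0.0.
    return [features.get(template, 0.0) for template in vocabulary]
```

Because the vector only depends on mod templates and values, two different items with the same affixes map to the same input, which is the decoupling from individual items described above.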
We could start small by using a single dedicated server for scraping/parsing.
At first, stalking top X players and their trades could introduce less bias than blindly looking at ALL the listings, at least I'd like to believe so :)
Yeah, I agree, this one is impossible to deal with since, like you said, we have no way of telling "sold" items from merely removed ones.
I tend to believe there won't be any high fluctuations in rare item prices. Even when HOWA ES gear with + int increased in price tenfold, there were so many listings that the network should be able to relearn it in a timely manner.
Yet again, I'm a newbie when it comes to ML, but I'm good at other areas.
Let me know what other issues you see with this.
We could start small by using a single dedicated server for scraping/parsing.
So I just looked into Azure prices, and it seems the CPU hours are by far the most expensive part (surprisingly, inbound bandwidth is completely free!). An A2 machine (2 cores, 3.5 GB RAM, 60 GB disk) clocks in at £40/€50/$60 per month (assuming it's running constantly). Additional disk space is cheap in comparison, but I'd figure that we'd likely need at least 2 cores (one for river parsing and one for training). It's not an unthinkable investment for me, but I wouldn't do it without a credible prospect of some result (and I can't really invest much time into this for at least another month). And if this project does end up going somewhere, that investment might also have to go up significantly (more training, some form of serving the training data to users). Again, nothing impossible, but it's definitely above what my student self would spend without a second thought. Note: While I do have an idle machine available atm, I wouldn't really want to offload the electricity cost onto my family while also eating a constant 1 MB/s from their connection.
We should probably ask around what workloads others have encountered (@trackpete?).
At first, stalking top X players and their trades could introduce less bias than blindly looking at ALL the listings, at least I'd like to believe so :)
Could be, but on the other hand this also makes it easier for "malicious" ones among them to sway the system by giving them more influence overall.
I tend to believe there won't be any high fluctuations in rare item prices. Even when HOWA ES gear with + int increased in price tenfold, there were so many listings that the network should be able to relearn it in a timely manner.
I wasn't concerned with the learning speed so much as with how users of this pricing facility will get the new trained networks. Would the pricing tool just periodically pull updated networks from the training server (see above)?
There are also funny game-theoretical considerations, most of which will never become reality but are still worth entertaining:
If everybody used this tool, what would happen to the economy? Would it tend more/faster towards perfect supply/demand? Would it be easier or harder to manipulate? What would happen if all rares were priced at the same value - would it move away from this "equilibrium" (maybe there needs to be some slight noise added to the estimate for this to happen)? How much of these effects remains when only x% of players use the tool?
Thinking about this, it's way easier for the tool to judge if an item is priced competitively when running on the user's computer (just read the log for PMs about items). Maybe even allowing users to tune e.g. how fast they want their items to sell (i.e. undercutting/increasing the estimated price based on how fast the items are selling).
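As a rough illustration of that tuning idea, the sketch below nudges the network's estimate down while nobody whispers and up when interest is high. Everything here is an assumption for illustration: the function name, the parameters, the thresholds, and the premise that whispers about an item have already been counted from the client log.

```python
def adjust_price(estimate, days_listed, whispers,
                 target_days=2.0, step=0.1, max_discount=0.5,
                 busy_threshold=5.0):
    """Scale an estimated price by how quickly the user wants to sell.

    estimate       -- price suggested by the model
    days_listed    -- how long the item has been listed so far
    whispers       -- number of PMs received about this item
    target_days    -- how fast the user would like the item to sell
    step           -- relative price change applied per adjustment
    max_discount   -- largest total undercut allowed (0.5 = half price)
    busy_threshold -- whispers per target window that count as "too cheap"
    """
    if whispers == 0:
        # No interest at all: undercut by one step per elapsed target window.
        periods = int(days_listed // target_days)
        factor = 1.0 - min(step * periods, max_discount)
    else:
        # Whispers per day, treating anything under a day as a full day.
        rate = whispers / max(days_listed, 1.0)
        factor = 1.0 + step if rate * target_days > busy_threshold else 1.0
    return estimate * factor
```

A user who wants faster sales would lower `target_days`; one who prefers to wait for a better price would raise it or shrink `step`.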
I think this has the potential for massive impact on the PoE economy - imagine if Acquisition automatically priced all your items for you! (Right now it doesn't even have a facility to automatically lower buyouts over time as far as I'm aware). Naturally there would be resistance against this ("I won't trust my items to some evil robot mind!"), although players do have the last word in every case (i.e. declining trades). I don't know, it would be interesting to see.
Based on your commenting times I feel like you're in some US time zone?
Hi again,
I'm going to sleep now, so a short reply before I do.
I have an i7 4790K dedicated server (32 GB of RAM, SSD, 1 Gbps network link) that I am willing to share for free.
I'm in EU timezone: GMT+2
How about you? EDIT: I just saw you are from UK, sorry :)
Actually I'm from Germany, studying in the UK atm though. :)
That's awesome that you have the server available!
The next things to consider could be:
How to preprocess the item data coming from GGG (there are a bunch of implementations around, looking through https://reddit.com/r/pathofexiledev should yield some). That includes turning the JSON into something less bloated (i.e. stripping irrelevant information like flavour texts, only storing information relevant for us like affixes and maybe socket data), but also filtering unrealistic prices (or at least allowing that to happen later on).
How to set up the network(s):
How do we handle different item classes (e.g. gloves/flasks/wands) and bases within them (e.g. gold ring/coral ring)? I reckon we'd use different networks for different classes, but the bases should probably be shared within the network. For some classes it may make sense to encode the base as a range (e.g. 2H axe progression), but that won't work for e.g. amulets.
Whether to work with the simple displayed numeric ranges of the mods or whether to decompose them into the actual affixes. A sufficiently complex neural network should be able to figure out any price variations in cases where this becomes relevant (i.e. "can I craft %phys?") anyway though, so I'd vote for just using the displayed values. Keeps our lives simple and likely has almost no impact in the general case.
What item information beyond affixes is worth including. The item filter conditions or just the poe.trade input form may help assemble these.
Which frameworks/libraries to use for all the above steps. I have a preference for writing code in C# (interfacing other libraries, including in other languages, shouldn't be very hard). I see you're quite big on Go (which I've never touched, but I'm open to new stuff). I don't have much experience regarding different NN libraries and also can't really judge different database systems (that's assuming we're storing sold items for at least some time for training purposes) - a few simple PostgreSQL setups are all I have to my name in that direction. These are all pretty big decisions...
Ok, there's probably a whole bunch of stuff in here that's really not needed for the first few iterations, but I'm trying to also keep the big picture in mind at this stage (especially in order to guide the choices regarding the last point).
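The preprocessing and base-encoding points from the list above could look roughly like this. It is a sketch, not a finished parser: the JSON field names (`note`, `typeLine`, `explicitMods`, `flavourText`, ...) and the `~b/o` / `~price` note format are assumptions modelled on the public stash tab data and need checking against real responses, and `max_amount` is an arbitrary placeholder for "unrealistic price".

```python
import re

# Fields assumed to matter for pricing; everything else is dropped.
KEEP_FIELDS = ("id", "typeLine", "ilvl", "implicitMods", "explicitMods", "sockets")

# Assumed note format, e.g. "~b/o 5 chaos" or "~price 1.5 exa".
PRICE_NOTE = re.compile(r"^~(?:b/o|price)\s+(\d+(?:\.\d+)?)\s+([a-z-]+)$")


def parse_price(note):
    """Return (amount, currency) from a price note, or None if unparseable."""
    match = PRICE_NOTE.match((note or "").strip().lower())
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


def slim_item(raw, max_amount=10000.0):
    """Strip an item down to pricing-relevant fields.

    Returns None for items without a usable price or with an amount
    outside (0, max_amount], which acts as a crude sanity filter.
    """
    price = parse_price(raw.get("note"))
    if price is None or not 0 < price[0] <= max_amount:
        return None
    slim = {key: raw[key] for key in KEEP_FIELDS if key in raw}
    slim["price"] = {"amount": price[0], "currency": price[1]}
    return slim


def one_hot_base(base, known_bases):
    """Encode a base type (e.g. "Coral Ring") within one item class."""
    return [1.0 if base == known else 0.0 for known in known_bases]
```

One-hot encoding fits classes like rings or amulets where bases have no natural order; for something like the 2H axe progression, a single ordinal input could replace it.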
Lastly, I suppose the goal would be to have this pricing included in the PoE-TradeMacro (see here) and/or Acquisition. Maybe it's worth asking if they'd be interested in it first though.
Sorry for the late reply, but I'd feel rude leaving a short reply after your well-thought-out answers :)
Where could I contact you to chat about this some more? Discord? Hangouts?
I've got Skype (max.langhof) and am active on reddit (MauranKilom).
Xmpp? Any irc servers you visit? My opinion on skype would be over the character limit, so I'd rather keep it to myself.
I'm also on Discord (MauranKilom#7673) but I don't tend to hang out there much at all. We could talk there though. And yeah, skype really isn't the pinnacle of IM, but you know.
Thanks,
I'll add you on discord this evening.
I received this reddit pm from /u/DaftWTPlayer:
Hey!
seen your git repo and the issue #1 that you are having.
Working on something similar.
I ignored most problems relating to the items being sold vs. placed in the stash and just scraped the POE API for the items in stashes.
Took around 6 hours to collect ~3.5GB of data and ~5M item mentions that are within the league and have a price associated with them.
You really don't need a server for the scraping - do it at home on an external usb drive. Main difficulty is storage, of which you'll need ~300GB for 1-2 weeks of the league.
Converting the API output to vectorized form is straightforward as well, since GGG provides lists of all item bases.
The rest is just training the network and optimising your scraper to try to get as close as you can to "real" trades.
My network is training :P So far it learned that MOST of the items are traded at 1-3cs range :)
I am happy to discuss it in more detail. After a few hours of training a tanh-relu-out network it still appears to be pretty bad.
Equally a classifier network was not doing that well either. I really suspect the main issue comes from item placement price vs. purchase price. The price fixing/manipulation does not even come into question here at this stage, as the output can vary wildly.
If you have any tips on how I can track down which items were removed from stashes, rather than placed there I would be very grateful :P
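On the question of spotting removed items: one approach is to remember the last seen contents of every stash tab and diff each new snapshot against it. The sketch below assumes the scraper can reduce a tab to a dict of item id to price and that every update carries the tab's full contents; `StashTracker` and `diff_stash` are hypothetical names.

```python
def diff_stash(previous, current):
    """Compare two snapshots of one stash tab (dicts of item id -> price).

    Returns (added, removed, repriced), where repriced maps an item id to
    its (old_price, new_price) pair.
    """
    added = {i: p for i, p in current.items() if i not in previous}
    removed = {i: p for i, p in previous.items() if i not in current}
    repriced = {i: (previous[i], current[i])
                for i in previous.keys() & current.keys()
                if previous[i] != current[i]}
    return added, removed, repriced


class StashTracker:
    """Keeps the last known snapshot of each stash tab."""

    def __init__(self):
        self._last = {}

    def update(self, stash_id, items):
        """Record a new snapshot and report what changed since the last one."""
        previous = self._last.get(stash_id, {})
        self._last[stash_id] = dict(items)
        return diff_stash(previous, items)
```

A removed item is still only a candidate sale: it may have been equipped, moved to a private tab, or vendored, so the last listed price of a removed item is a noisy label rather than a confirmed trade.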
I guess that's encouraging, and it's good to know more people are working on/interested in it - maybe you would like to get in touch with him. I'll potentially find time for this over easter, but I still have a host of other things to deal with first.
Hi,
How did it go with this project?