elcolumbio / mlrepricer

Explore pricing data. Share insights and models. Build environment for repricer.

threading #5

Open elcolumbio opened 6 years ago

elcolumbio commented 6 years ago

This is my first time using threading, so there is lots of room to improve.

I try to set up what seems plausible for me.

elcolumbio commented 6 years ago

Also, concurrency with asyncio sounds very interesting. It's a really interesting new problem field to tackle.

Bobspadger commented 6 years ago

I've got something like this working in production now, I have 4000+ products using this. I have not seen a need for threading / async yet. You may just over-complicate the system needlessly.

elcolumbio commented 6 years ago

Yeah i don't know what i am doing yet :).

I just found out it's quite convenient to use threads from a jupyter notebook. You don't block your notebook. That's the only use case right now. Later i can maybe use it to monitor threads.

I also don't plan to do parallelism ever.

Right now it's 5 lines of boilerplate that you can ignore by calling the function directly.
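For illustration, here is a minimal sketch of that kind of boilerplate, using only the standard library. The names `run_in_background` and `poll_prices` are hypothetical and not taken from the repository; `poll_prices` merely stands in for a long-running listener loop.

```python
import threading
import time


def run_in_background(func, *args, **kwargs):
    """Start func in a daemon thread and return the Thread object."""
    thread = threading.Thread(target=func, args=args, kwargs=kwargs, daemon=True)
    thread.start()
    return thread


def poll_prices(results, rounds=3, delay=0.01):
    # Hypothetical stand-in for a long-running listener loop.
    for i in range(rounds):
        time.sleep(delay)
        results.append(i)


results = []
worker = run_in_background(poll_prices, results)
# A notebook cell would return here immediately; the thread keeps running.
# join() is only called to wait for the result in this small demo.
worker.join(timeout=5)
```

Calling `poll_prices(results)` directly does the same work but blocks the cell until it finishes, which is the trade-off described above.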

elcolumbio commented 6 years ago

@Bobspadger Are you allowed to share code?

I still miss the complete price updating part for example.

Bobspadger commented 6 years ago

I'm not sure - it drives our Amazon strategy so don't know how much of it I can share.

The overview is really that we wait for an SQS message, analyse the contents, and decide to either raise or lower our price based on whether we have the buy box.

My products team set min / max prices for a product, then I operate within those windows to get the buy box.

Bobspadger commented 6 years ago

What I'd be doing (and it's kind of how I have it running at present) is I have a Redis db of unique IDs already actioned, to make sure we've not already processed the message, and one of SKU codes and last actions, to make sure we're not acting on an older message.

The SQS message can be out of order and duplicated so the Redis system works really well.

Then, if I needed more throughput, I'd just start another instance of the app, as the safeguards in the Redis db stop me from processing duplicate or old messages.
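A rough sketch of how those two safeguards could look, assuming a set of processed message IDs and a hash of last-action timestamps per SKU. This is not the production code described above: the key names `processed_ids` and `last_action`, the function `should_process`, and the `InMemoryStore` class are all hypothetical. `InMemoryStore` only mimics the redis-py calls `sadd`, `hget` and `hset` so that the example runs without a server; a real `redis.Redis` client could be passed in instead.

```python
class InMemoryStore:
    """Tiny stand-in mimicking the redis-py calls used below (assumption)."""

    def __init__(self):
        self.sets = {}
        self.hashes = {}

    def sadd(self, name, value):
        # Like Redis SADD: returns 1 if the member is new, 0 if already present.
        members = self.sets.setdefault(name, set())
        if value in members:
            return 0
        members.add(value)
        return 1

    def hget(self, name, key):
        return self.hashes.get(name, {}).get(key)

    def hset(self, name, key, value):
        self.hashes.setdefault(name, {})[key] = value
        return 1


def should_process(store, message_id, sku, timestamp):
    """Return True only for messages that are neither duplicates nor stale."""
    # Safeguard 1: skip message IDs that were already actioned.
    if store.sadd("processed_ids", message_id) == 0:
        return False
    # Safeguard 2: skip messages older than the last action for this SKU.
    last = store.hget("last_action", sku)
    if last is not None and float(last) >= timestamp:
        return False
    store.hset("last_action", sku, timestamp)
    return True


store = InMemoryStore()
first = should_process(store, "msg-1", "SKU-1", 100.0)      # new message
duplicate = should_process(store, "msg-1", "SKU-1", 100.0)  # same ID again
stale = should_process(store, "msg-0", "SKU-1", 90.0)       # older timestamp
```

Note that the read-then-write on `last_action` is not atomic in this sketch; with several app instances running against one Redis, that step would probably need a transaction or a Lua script.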

elcolumbio commented 6 years ago

Thank you. Will look into Redis 5.0 right away. I tried to store everything I get from the Amazon APIs in SQL. But they use NoSQL themselves?

It's OK, I appreciate your comments. Personally I believe a good strategy is to contribute to the open source stack and leverage it as a strong foundation. Pricing is a commodity that the marketplaces offer for free.

elcolumbio commented 6 years ago

I have found what I think is a good Redis structure: a sorted set for each ASIN.

zadd asin1 1 message
zadd asin2 2 message
zadd asin3 3 message

where the format is key: asin, score: timestamp, value: message.

Then I can fetch the newest message for an ASIN like this:

r.zrevrangebyscore('asin1', "inf", "-inf", start=0, num=1)

One guy said it's fast: "It's O(log(N)), so for 1 billion entries it's about 20 comparisons"

Do you store the message in XML format or parse it first?

If I don't want to scan every ASIN, I need a set where I put the names of the ASINs that have new messages. A worker can then pop from that set.
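Put together, the idea might look roughly like the sketch below. It is only an illustration: the key name `pending_asins`, the helper functions, and the `InMemoryStore` class are hypothetical. `InMemoryStore` imitates the redis-py methods `zadd`, `zrevrangebyscore`, `sadd` and `spop` so the example runs without a Redis server; with a real client (created with `decode_responses=True` so that strings come back instead of bytes) the helpers should work the same way.

```python
class InMemoryStore:
    """Stand-in imitating the few redis-py methods used below (assumption)."""

    def __init__(self):
        self.zsets = {}  # key -> {member: score}
        self.sets = {}   # key -> set of members

    def zadd(self, name, mapping):
        zset = self.zsets.setdefault(name, {})
        added = sum(1 for member in mapping if member not in zset)
        zset.update(mapping)
        return added

    def zrevrangebyscore(self, name, max_score, min_score, start=None, num=None):
        hi, lo = float(max_score), float(min_score)
        items = [(s, m) for m, s in self.zsets.get(name, {}).items() if lo <= s <= hi]
        items.sort(reverse=True)  # highest score (newest timestamp) first
        members = [m for _, m in items]
        if start is not None and num is not None:
            members = members[start:start + num]
        return members

    def sadd(self, name, *values):
        members = self.sets.setdefault(name, set())
        before = len(members)
        members.update(values)
        return len(members) - before

    def spop(self, name):
        members = self.sets.get(name)
        return members.pop() if members else None


def record_message(store, asin, timestamp, message):
    # One sorted set per ASIN: score = timestamp, member = message.
    store.zadd(asin, {message: timestamp})
    # Remember that this ASIN has something new for a worker to look at.
    store.sadd("pending_asins", asin)


def latest_message(store, asin):
    newest = store.zrevrangebyscore(asin, "inf", "-inf", start=0, num=1)
    return newest[0] if newest else None


def drain_pending(store):
    """Pop ASIN names until the set is empty; return the newest message of each."""
    latest = {}
    while True:
        asin = store.spop("pending_asins")
        if asin is None:
            break
        latest[asin] = latest_message(store, asin)
    return latest


store = InMemoryStore()
record_message(store, "asin1", 1, "old offer")
record_message(store, "asin1", 3, "new offer")
record_message(store, "asin2", 2, "only offer")
```

One thing to keep in mind: sorted-set members are unique, so storing the exact same message text twice for one ASIN only updates its score rather than adding a second entry.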