evjrob / bayes-bet

NHL hockey model and predictions
http://bayesbet.io
MIT License

prediction data #28

Open maaaaz28 opened 11 months ago

maaaaz28 commented 11 months ago

Hey! How are you getting prediction data from DynamoDB? Is there any way of getting prediction data locally? Also, can you describe the last_pred function, because the code crashed after that.

evjrob commented 11 months ago

I’m afraid I don’t have enough context to understand your questions.

If you don’t mind, could you please tell me:

  1. The exact command you’re running that raises the error
  2. The content of that error message
  3. How/where you’re executing this. Just locally on your own computer? Or in a cloud environment you set up?

maaaaz28 commented 11 months ago

What I'm actually saying is that I'm running it locally. I don't have access to a DynamoDB instance, so I'm trying to move it to a database like SQLite. I am trying to run the main.py file inside the model/nhl folder with a few changes. Since I don't have DynamoDB connected or any data locally, this line (line 61) returns None:

last_pred = most_recent_dynamodb_item('nhl', today)

(link to that line) https://github.com/evjrob/bayes-bet/blob/2ef6d032b3acf83ea47c0f01171c22680ed0fae0/model/bayesbet/nhl/main.py#L61

That in turn breaks the next line, so the app crashes. I'm sure I'm doing something wrong, but I don't know what at this point. Any help would be appreciated, thank you 🙂 Could you at least provide me with the schema for DynamoDB, or even a dump for testing?

maaaaz28 commented 11 months ago

"(line 61) returns None last_pred = most_recent_dynamodb_item('nhl', today)" (the line is in bayes-bet-master/model/bayesbet/nhl/main.py)

evjrob commented 11 months ago

Ok, that makes sense. The DynamoDB instance I host won’t be accessible from outside my AWS environment. Also this main module is actually a bit old — a legacy from when I used to run this model in AWS Batch. I actually intended to delete it and just haven’t done it yet.

What you really want are my old evaluation scripts. Unfortunately these workflows were uncommitted and got lost in a hard drive failure. I’m partway through creating new workflows with better tools for tracking model performance that will integrate directly with this GitHub repo for everyone to see. If you’re okay waiting on me to finish those, I hope to have them done within a few weeks.

Otherwise, there’s an example of what a single DynamoDB record looks like here: infrastructure/terraform/assets/db_item.json

It wouldn’t be too hard to write a model execution script that stitches together the essential steps without DynamoDB. You could just maintain a list of these dictionary objects as you iterate over the game days.
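A minimal sketch of that idea, not the actual bayes-bet code. It assumes each record is a dictionary with 'League' and 'PredictionDate' keys, and that dates are ISO-formatted strings (YYYY-MM-DD) so that string comparison matches chronological order. The names most_recent_local_item, run_one_day, and the 'PreviousDate' field are hypothetical, and the dates are placeholders.

```python
# In-memory stand-in for the DynamoDB table: one dict per game day.
records = []


def most_recent_local_item(league, date):
    """Return the latest record for `league` dated on or before `date`, or None."""
    candidates = [
        r for r in records
        if r["League"] == league and r["PredictionDate"] <= date
    ]
    if not candidates:
        return None  # Cold start: the caller has to handle a missing prior.
    return max(candidates, key=lambda r: r["PredictionDate"])


def run_one_day(league, date):
    """Hypothetical per-day step: look up the previous record, then append a new one."""
    last_pred = most_recent_local_item(league, date)
    # Placeholder: a real script would update the model here, starting from
    # last_pred (or from default priors when last_pred is None).
    new_record = {
        "League": league,
        "PredictionDate": date,
        "PreviousDate": last_pred["PredictionDate"] if last_pred else None,
    }
    records.append(new_record)
    return new_record


# Iterate over (placeholder) game days, carrying state forward in `records`.
for game_day in ["2023-10-10", "2023-10-11", "2023-10-12"]:
    run_one_day("nhl", game_day)
```

The important part is the explicit None branch: on the first game day there is no previous record, so whatever consumes last_pred needs a fallback rather than indexing into it directly.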

maaaaz28 commented 10 months ago

Hey! The problem I'm facing with the prediction data is that the JSON file you mentioned is just a static prediction, I suppose. Can you tell me how and where these predictions are being saved? When we run it locally, the predictions show up as None or empty. I tried storing them in another DB as well, but it still showed None. Can you tell me how we’re supposed to save predictions locally?

evjrob commented 10 months ago

They’re being saved in DynamoDB. They’re really just JSON and can be easily represented as python dictionaries. If you wanted to, you should just be able to save them to disk as JSON.
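As a rough sketch of saving to disk, assuming the records are kept as a list of plain dictionaries: the helper names save_records and load_records, the file name predictions.json, and the example record are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_records(records, path):
    """Write the full list of prediction records to one JSON file."""
    Path(path).write_text(json.dumps(records, indent=2))


def load_records(path):
    """Read the records back; return an empty list if the file does not exist yet."""
    p = Path(path)
    if not p.exists():
        return []
    return json.loads(p.read_text())


# Example round trip in a temporary directory (placeholder record).
store = Path(tempfile.mkdtemp()) / "predictions.json"
example = [{"League": "nhl", "PredictionDate": "2023-10-10"}]
save_records(example, store)
restored = load_records(store)
```

One caveat: records read through boto3 may carry numbers as decimal.Decimal, which json.dumps rejects by default. Passing default=float to json.dumps is one possible workaround if that comes up.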

Do you mind providing the code that’s producing these None/empty predictions? It’s very hard to know what’s going on without it.

maaaaz28 commented 10 months ago

This is the DynamoDB version of the code in db.py:

https://github.com/evjrob/bayes-bet/blob/2ef6d032b3acf83ea47c0f01171c22680ed0fae0/model/bayesbet/nhl/db.py#L23C12-L23C12

So I replaced it with a non-DynamoDB version:

with open("db_item.json", 'r') as b:
    api_data = json.load(b)
query = table.where('League', '==', hash_key).where('PredictionDate', '<=', date).get()
firestore_response = query

# Combine data from the file and Firestore response
response = {
    'Items': [api_data] + [doc.to_dict() for doc in firestore_response]
}
item_count = len(response['Items'])
if item_count > 0:
    most_recent_item = response['Items'][0]
else:
    most_recent_item = None

return most_recent_item

But it says "index is out of range" when I try getting the data from the db_item.json file, and when it's not the file, it crashes inside the recent_dynamo function, saying None/empty on "items". When I trace main, it shows last_pred as empty.

maaaaz28 commented 10 months ago

All I want is to get the PyMC3 model working locally. Your help would be appreciated a lot.

evjrob commented 10 months ago

Ok, got it. I’ll take a closer look later and see if there are any easy things to get you unstuck.

Worst case, you’ll be able to use my new model evaluation pipeline when I’m done. I’m still working on this and I don’t have a ton of time right now to finish it. It will probably be merged sometime in the new year.