Closed dan-ryan closed 7 years ago
No, I don't think you should do that. In my mind you have two options:
Thanks for the answer. I'm predicting a batsman's score in a cricket game, so I feel ids would make it more accurate (though I have yet to test this). I'm using an LSTM. The data I'm working with is ball-by-ball data:
batter id, bowler id, opposite batter id, inning order, over order, points count, batting team points, is legal delivery, legal delivery order, first team id, second team id, total batter points.
I'm actually currently dividing by the highest id in the training data. But if I want to make this more practical, I won't know how high the ids will be, as new players are created all the time. Would using the 32-bit max int (2,147,483,647) be an issue? That is still a lot less than Number.MAX_SAFE_INTEGER.
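To illustrate the concern with dividing by the training set's highest id, here is a small sketch (the ids are made up for illustration):

```javascript
// Sketch of the problem with normalising by the training set's highest id.
var trainingIds = [17, 204, 987];
var maxTrainingId = Math.max.apply(null, trainingIds); // 987

// Fine for ids seen during training: result stays in (0, 1].
console.log(204 / maxTrainingId); // ≈0.2067

// But a player created after training can exceed 1.0, a range the
// network never saw during training.
var newPlayerId = 1500;
console.log(newPlayerId / maxTrainingId); // ≈1.52
```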
Okay, I see, your ids are definitely of importance then. I'm not sure whether choosing a high maximum id affects the learning capability of your network (due to the resulting small numbers), but I suppose it will have some effect.
You should choose a maximum number that you know will never be exceeded, but that is not too large. For example, you know there will never be 2,147,483,647 teams or players. I'm not really into cricket, but I assume there are no more than 1000 teams and 20000 players (at least, of which you have the (future) training data).
Thanks for the advice. I'll choose a number that I believe won't be possible to hit; if it gets close, I'll add warnings in the code that someone can fix later.
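A minimal sketch of that cap-plus-warning idea, assuming the 1000-team / 20000-player caps mentioned above (the 10% warning threshold is also just an assumption):

```javascript
// Hypothetical normalisation helper using fixed caps that are
// believed never to be hit.
var MAX_TEAMS = 1000;    // assumed cap on team ids
var MAX_PLAYERS = 20000; // assumed cap on player ids

function normalise(id, cap) {
  if (id > cap) {
    throw new Error('id ' + id + ' exceeds cap ' + cap + '; raise the cap and retrain');
  }
  // Warn while there is still room to raise the cap and retrain.
  if (id > cap * 0.9) {
    console.warn('id ' + id + ' is within 10% of cap ' + cap);
  }
  return id / cap;
}

console.log(normalise(42, MAX_TEAMS));      // 0.042
console.log(normalise(12345, MAX_PLAYERS)); // 0.61725
```

Keeping the caps as named constants means raising one later is a single-line change, though any already-trained network would need retraining since its inputs would be rescaled.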
I'm new to neural networks, so I'm wondering if I have the right idea. My training data has ids, which are 32-bit numbers. To normalise the data, I'm wondering if I should do:
var normalisedId = id / Number.MAX_SAFE_INTEGER;
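For scale, dividing even the largest 32-bit id by Number.MAX_SAFE_INTEGER squashes every id into a tiny sliver near zero, which leaves the network almost no variation to learn from. A quick sketch (the 20000-player cap is just the figure assumed later in the thread):

```javascript
var id = 2147483647; // largest 32-bit signed id

// Number.MAX_SAFE_INTEGER is 2^53 - 1, so even the biggest 32-bit id
// normalises to roughly 2^-22.
console.log(id / Number.MAX_SAFE_INTEGER); // ≈2.4e-7

// Dividing by a realistic cap instead keeps values spread across (0, 1].
var MAX_PLAYERS = 20000; // hypothetical cap
console.log(12345 / MAX_PLAYERS); // 0.61725
```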