Probable "look-ahead" bias from fitting StandardScaler on the entire test data

Hi Shuai,

Thanks for taking the time to prepare and share this nice example of RL applied to finance. I'm learning a lot by exploring the code.

I've noticed that during data pre-processing you normalize the data using StandardScaler. Two questions come to mind:

1. Are you aware that by creating a StandardScaler based on the min and max over all the test data you are introducing "look-ahead"? The model ends up using information that would not be available to you in production mode.
2. It feels quite strange that you normalize the number of stocks owned by some "magic number", as you write in the code. Is there any rationale behind this number?
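
To illustrate the first question, here is a minimal sketch of what I mean. It is not taken from the repository: the price series, the split point and the helper names `fit_scaler` and `transform` are all made up for illustration, and I use plain Python rather than scikit-learn so it runs anywhere. The idea is that the scaling statistics are computed on the earlier (training) period only and then reused on later data.

```python
from statistics import fmean, pstdev


def fit_scaler(values):
    """Return (mean, std) computed from `values` only (hypothetical helper)."""
    mean = fmean(values)
    std = pstdev(values) or 1.0  # fall back to 1.0 if the data is constant
    return mean, std


def transform(values, mean, std):
    """Standardize `values` with statistics that were fitted elsewhere."""
    return [(v - mean) / std for v in values]


# Made-up price series, ordered in time; the last three points are the "future".
prices = [10.0, 11.0, 12.0, 13.0, 50.0, 55.0, 60.0]
train, test = prices[:4], prices[4:]

# Leaky: statistics computed over the whole series, including the future.
leaky_mean, leaky_std = fit_scaler(prices)

# Leak-free: statistics computed on the training period only, then reused.
train_mean, train_std = fit_scaler(train)
test_scaled = transform(test, train_mean, train_std)
```

With the leaky variant the mean and standard deviation already "know" about the later price jump, which is exactly the information a live system would not have. Fitting on the earlier period only means later values may fall far outside the range seen during fitting, but that seems closer to what would happen in production.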

Thanks a lot!

jayinai / teach-machine-to-trade