rmattson1008 opened 1 year ago
Seems like the easiest way to do this without touching any DQN procedures is to write embeddings straight to a file in the forward pass. This needs a state variable to toggle whether or not to write, turned on only for the target DQN when passing the probe set through the model.
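A minimal sketch of that toggle idea, assuming PyTorch (the class and attribute names here are hypothetical, not from any existing codebase): a flag on a small recorder object controls whether a forward hook captures anything, so it can be flipped on only for the probe pass.

```python
import torch
import torch.nn as nn

class EmbeddingRecorder:
    """Hypothetical helper: a forward hook gated by a state variable."""

    def __init__(self):
        self.capture = False   # off during normal DQN training steps
        self.embeddings = []   # filled only while capture is True

    def hook(self, module, inputs, output):
        if self.capture:
            self.embeddings.append(output.detach().cpu())

recorder = EmbeddingRecorder()
# Stand-in for the target DQN; record post-ReLU hidden activations.
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
net[1].register_forward_hook(recorder.hook)

net(torch.randn(8, 4))        # normal pass: nothing recorded
recorder.capture = True
net(torch.randn(8, 4))        # probe pass: one batch of embeddings recorded
recorder.capture = False
```

The captured tensors could then be dumped to disk (e.g. with `torch.save`) between probe passes, keeping the DQN loop itself untouched.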
A note on the probe set: each trained checkpoint will need to be passed over the probe set to get a "final" answer, so that we are not working with ground truth. (Unless I can get the optimal action from gym? I don't think that's the point of RL.)
Hooks are live, but batching is unfamiliar. Check.
You will need to be very organized here, or you will spend all week backtracking. The hooks themselves are pretty straightforward (a dict with a key for each hidden layer); saving is where I tend to get messy.
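The "dict with a key for each hidden layer" pattern can be sketched like this (assuming PyTorch; the layer selection and the commented checkpoint-numbered filename are illustrative assumptions, not a prescribed layout):

```python
import torch
import torch.nn as nn

activations = {}  # one entry per hooked hidden layer, keyed by layer name

def make_hook(name):
    # Closure so each hook knows which dict key it writes to.
    def hook(module, inputs, output):
        activations[name] = output.detach().cpu()
    return hook

# Stand-in network; in practice this would be the target DQN.
net = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
for name, module in net.named_modules():
    if isinstance(module, nn.Linear):
        module.register_forward_hook(make_hook(name))

net(torch.randn(8, 4))  # probe pass fills `activations`

# For tidy saving, one file per checkpoint keeps things organized, e.g.:
# torch.save(activations, f"probe_embeddings/step_{step:07d}.pt")
```

Saving the whole dict once per checkpoint (rather than appending per-layer files ad hoc) is one way to avoid the backtracking mentioned above.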
Make sure to keep the probe set separate from the DQN data. For the DQN, we have a train/val split. For the separate probe set, we have a train/test split (or possibly some cross-validation?). How the probes are trained matters less, as long as the splits are maintained and everything is kept very consistent.
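One way to keep the probe split consistent across every checkpoint is to fix it once with a seed (the function name here is a hypothetical illustration, not an existing helper):

```python
import random

def probe_split(indices, test_frac=0.2, seed=0):
    """Deterministic probe train/test split: same seed, same split,
    so every checkpoint is probed on exactly the same examples."""
    rng = random.Random(seed)
    idx = list(indices)
    rng.shuffle(idx)
    n_test = int(len(idx) * test_frac)
    return idx[n_test:], idx[:n_test]  # (probe_train, probe_test)

probe_train, probe_test = probe_split(range(100))
```

Because the split is a pure function of the seed, it can be recomputed anywhere without saving index files, and the probe test set stays disjoint from probe train by construction.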