Closed: HilbertHuangHitomi closed this issue 3 years ago
cc: @felixp8 -- what's the difference between the files you handed off and the ones the latest nlb uses?
I don't believe I've changed the file format since then. It looks like this function expects the keys to be 'train_data_heldin', etc., though nlb_tools uses 'train_spikes_heldin' and so on. Did you change those manually at some point @joel99?
Here's my working procedure: modify `XXX_data_XXXX` to `XXX_spikes_XXXX` in src/dataset.py.
```python
if 'eval_spikes_heldin' in h5dict:  # NLB data
    get_key = lambda key: h5dict[key].astype(np.float32)
    train_data = get_key('train_spikes_heldin')
    train_data_fp = get_key('train_spikes_heldin_forward')
    train_data_heldout_fp = get_key('train_spikes_heldout_forward')
    train_data_all_fp = np.concatenate([train_data_fp, train_data_heldout_fp], -1)
    valid_data = get_key('eval_spikes_heldin')
    train_data_heldout = get_key('train_spikes_heldout')
    if 'eval_spikes_heldout' in h5dict:
        valid_data_heldout = get_key('eval_spikes_heldout')
    else:
        valid_data_heldout = np.zeros(
            (valid_data.shape[0], valid_data.shape[1], train_data_heldout.shape[2]), dtype=np.float32
        )
    if 'eval_spikes_heldin_forward' in h5dict:
        valid_data_fp = get_key('eval_spikes_heldin_forward')
        valid_data_heldout_fp = get_key('eval_spikes_heldout_forward')
        valid_data_all_fp = np.concatenate([valid_data_fp, valid_data_heldout_fp], -1)
    else:
        valid_data_all_fp = np.zeros(
            (valid_data.shape[0], train_data_fp.shape[1], valid_data.shape[2] + valid_data_heldout.shape[2]), dtype=np.float32
        )
    # NLB data does not have ground truth rates
    if mode == DATASET_MODES.train:
        return train_data, None, train_data_heldout, train_data_all_fp
    elif mode == DATASET_MODES.val:
        return valid_data, None, valid_data_heldout, valid_data_all_fp
```
I used nlb_tools to read the NWB data and save it as h5 with something like:
```python
train_dict = make_train_input_tensors(
    dataset,
    dataset_name='mc_maze_small',
    trial_split='train',
    include_behavior=True,
    include_forward_pred=True,
)
eval_dict = make_eval_input_tensors(
    dataset,
    dataset_name='mc_maze_small',
    trial_split='val',
)
data_dict = {
    'eval_spikes_heldin': eval_dict['eval_spikes_heldin'],
    'eval_spikes_heldout': eval_dict['eval_spikes_heldout'],
    'train_spikes_heldin': train_dict['train_spikes_heldin'],
    'train_spikes_heldout': train_dict['train_spikes_heldout'],
    'train_behavior': train_dict['train_behavior'],
    'train_spikes_heldin_forward': train_dict['train_spikes_heldin_forward'],
    'train_spikes_heldout_forward': train_dict['train_spikes_heldout_forward'],
}
save_to_h5(data_dict, os.path.join('./data/mc_maze_small.h5'))
```
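Before writing the file, a quick consistency check on the tensor shapes can catch mismatches early. A minimal sketch, assuming the NLB convention of (trials, time, neurons) arrays; the helper name is hypothetical:

```python
import numpy as np

def check_nlb_shapes(data_dict):
    """Assert that held-in/held-out tensors agree on trial and time
    dimensions, and that train/eval agree on neuron counts.
    Hypothetical sanity check; arrays are (trials, time, neurons)."""
    tin = data_dict['train_spikes_heldin']
    tout = data_dict['train_spikes_heldout']
    ein = data_dict['eval_spikes_heldin']
    eout = data_dict['eval_spikes_heldout']
    assert tin.shape[:2] == tout.shape[:2], "train trial/time mismatch"
    assert ein.shape[:2] == eout.shape[:2], "eval trial/time mismatch"
    assert tin.shape[2] == ein.shape[2], "held-in neuron count mismatch"
    assert tout.shape[2] == eout.shape[2], "held-out neuron count mismatch"
    return True
```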
./config/mc_maze_small.yaml:

```yaml
DATA:
  DATAPATH: "./data"
  TRAIN_FILENAME: 'mc_maze_small.h5'
  VAL_FILENAME: 'mc_maze_small.h5'
```
However, I got the following issue:
```
removing ./Results/logs/mc_maze_small
2021-10-14 09:18:01,907 Using 1 GPUs
2021-10-14 09:18:01,946 Using cuda:1
2021-10-14 09:18:01,946 Loading mc_maze_small.h5 in train
2021-10-14 09:18:02,155 Clipping all spikes to 7.
2021-10-14 09:18:02,155 Training on 75 samples.
2021-10-14 09:18:02,156 Loading mc_maze_small.h5 in val
2021-10-14 09:18:10,835 number of trainable parameters: 682538
  0%|          | 0/50501 [00:00<?, ?it/s]/opt/conda/conda-bld/pytorch_1587428091666/work/torch/csrc/utils/python_arg_parser.cpp:756: UserWarning: This overload of add_ is deprecated:
        add_(Number alpha, Tensor other)
Consider using one of the following signatures instead:
        add_(Tensor other, *, Number alpha)
  0%|          | 0/50501 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "src/run.py", line 144, in <module>
    main()
  File "src/run.py", line 58, in main
    run_exp(**vars(args))
  File "src/run.py", line 137, in run_exp
    runner.train()
  File "/home/username/Projects/neural-data-transformers/src/runner.py", line 341, in train
    metrics = self.train_epoch()
  File "/home/username/Projects/neural-data-transformers/src/runner.py", line 482, in train_epoch
    eval_r2 = self.neuron_r2(rates, pred_rates)
  File "/home/username/Projects/neural-data-transformers/src/runner.py", line 749, in neuron_r2
    gt, pred = self._clean_rates(gt, pred, **kwargs)
  File "/home/username/Projects/neural-data-transformers/src/runner.py", line 737, in _clean_rates
    raise Exception(f"Incompatible r2 sizes, GT: {gt.size()}, Pred: {pred.size()}")
Exception: Incompatible r2 sizes, GT: torch.Size([25, 35, 107]), Pred: torch.Size([25, 45, 142])
```
In runner.py, I commented out:

```python
# eval_r2 = self.neuron_r2(rates, pred_rates)
# metrics_dict['eval_r2'] = eval_r2
```
Now it seems to run smoothly.
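Rather than disabling the metric entirely, one alternative sketch (not the maintainers' fix) would crop the predicted rates back to the ground-truth window and held-in channels before computing R2. The shape convention below is read off the traceback: 45 = 35 time bins + 10 forward-prediction bins, 142 = 107 held-in + 35 held-out neurons. The helper is hypothetical:

```python
import numpy as np

def crop_pred_to_gt(pred, gt):
    """Crop predicted rates shaped (trials, time + forward, heldin + heldout)
    down to the ground-truth window (trials, time, heldin).
    Works on numpy arrays or torch tensors; illustrative only, not NDT's API."""
    return pred[:, :gt.shape[1], :gt.shape[2]]
```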
I have followed nlb_tools to read the NLB datasets, but I noticed that NDT needs h5 files that are not the same as the ones I saved. How should I prepare the dataset I downloaded from DANDI for running NDT, please?