Closed Yaoyx closed 1 year ago
I still can't figure out the problem you're trying to solve here. explore_model.ipynb runs fine for me. Hopefully this solves your problem, but I don't see the general need to add this option to basenji_train.py. So I'm going to leave this PR here for now.
Hi @davek44,
I see. My problem was that the output of basenji_data_write is in index format, with shape [N, W, 1], but the model requires one-hot input with shape [N, W, 4] (#173). In dataset.py, 'raw' is set to True in the call to generate_parser https://github.com/calico/basenji/blob/615b9eec8a591783b16d959029ddad08edae853d/basenji/dataset.py#L215, which skips the one-hot encoding here https://github.com/calico/basenji/blob/615b9eec8a591783b16d959029ddad08edae853d/basenji/dataset.py#L89-L93. So I was wondering whether 'raw' should be False there, or whether the 'raw' parameter should be exposed to the user so that we can choose.
Regards, Yao
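For readers following the shape mismatch described above, here is a minimal NumPy sketch of converting index-format sequences of shape [N, W, 1] into one-hot sequences of shape [N, W, 4]. The function name `index_to_one_hot` is hypothetical and is not part of basenji. Mapping any index outside 0-3 (for example an N base) to an all-zero row is an assumption of this sketch, not something confirmed in this thread.

```python
import numpy as np


def index_to_one_hot(seq_idx, depth=4):
    """Convert index-encoded sequences to one-hot (hypothetical helper).

    seq_idx: integer array of shape [N, W, 1] or [N, W].
    Returns a float32 array of shape [N, W, depth].
    """
    seq_idx = np.asarray(seq_idx).astype(np.int64)

    # Drop the trailing singleton axis: [N, W, 1] -> [N, W].
    if seq_idx.ndim == 3 and seq_idx.shape[-1] == 1:
        seq_idx = seq_idx[..., 0]

    # Assumption: indices outside [0, depth) become all-zero rows.
    valid = (seq_idx >= 0) & (seq_idx < depth)

    # Row lookup in the identity matrix gives the one-hot vectors.
    one_hot = np.eye(depth, dtype=np.float32)[np.where(valid, seq_idx, 0)]
    one_hot[~valid] = 0.0
    return one_hot


# One sequence of width 5; the last position uses an out-of-range index.
batch = np.array([[[0], [1], [2], [3], [4]]], dtype=np.uint8)
encoded = index_to_one_hot(batch)
print(encoded.shape)
```

A check like `encoded.shape[-1] == 4` before calling the model is a cheap way to confirm which format a dataset is actually yielding.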
Description of your changes
I exposed the 'raw' argument in the CLI so the user can specify whether the input data is already one-hot encoded. This makes data loading in explore_model.ipynb consistent with the input shape and format the model requires for prediction.
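As an illustration only, the idea of exposing such a switch could be sketched as below. The `--raw` flag name, the `build_parser` and `should_one_hot_encode` helpers, and the use of argparse are all assumptions made for this sketch; they are not the actual diff in this PR or the real basenji_train.py interface.

```python
import argparse


def build_parser():
    """Build a hypothetical CLI parser with a 'raw' switch."""
    parser = argparse.ArgumentParser(description="Hypothetical training CLI sketch")
    parser.add_argument(
        "--raw",
        action="store_true",
        default=False,
        help="Pass sequences through as stored, skipping one-hot encoding",
    )
    return parser


def should_one_hot_encode(args):
    """One-hot encode only when the data is not passed through raw."""
    return not args.raw


# Default: no flag given, so sequences would be one-hot encoded on load.
default_args = build_parser().parse_args([])
# With the flag: sequences are left exactly as stored.
raw_args = build_parser().parse_args(["--raw"])
print(should_one_hot_encode(default_args), should_one_hot_encode(raw_args))
```

Defaulting the switch to off keeps the one-hot path as the standard behaviour, so only users whose data is already in the expected format need to opt in.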
Issue ticket number and link
#173
Type of change
(If applicable) How has this been tested?