Closed alecokas closed 4 years ago
`dataloaders` into `dataloaders/birds`, `dataloaders/flowers`, `dataloaders/xrays`, and then you can have `loaders.py` and `helper_fxns.py` inside each? But that might be overkill. Alternatively, make them a bit more generic if possible and then leave them in this file. But in all honesty, how they currently are is also fine!

This can now be merged if you are happy with it @Devin-Taylor, using one-hot encoding for categorical variables. Keep the branch open as I'll actually work on the embedding network here.
Comments required on two points please:

I wasn't really sure where these functions should sit. Can you think of anything better than `utils/data_helpers.py`? At the moment I run them like this:

Can you see any problem with the way I have gone about assigning different categories to integers? I'm worried that by doing this I've implicitly made certain categories more related to each other than others. For instance, say I have three categories `['X positive', 'X negative', 'no record']` and I assign these to the integers `[0, 1, 2]`. Is it an issue that this builds in `X positive` being less similar to `no record` than `X positive` is to `X negative`? Obviously we will train some embedding on top of this, but maybe the initialization is dangerous?

Probably don't merge yet @Devin-Taylor
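To make the concern about integer codes concrete, here is a minimal sketch in NumPy. It is not code from this branch, and `pairwise_distances` is a hypothetical helper written only for this illustration. It compares the distances between the three example categories under the `[0, 1, 2]` assignment and under one-hot encoding.

```python
import numpy as np

# The three example categories from the question above.
categories = ["X positive", "X negative", "no record"]

# Integer codes as described: X positive -> 0, X negative -> 1, no record -> 2.
int_codes = np.arange(len(categories))

# One-hot codes: row i is the vector for category i.
one_hot = np.eye(len(categories))


def pairwise_distances(codes):
    """Euclidean distance between every pair of category encodings.

    Hypothetical helper for this illustration only.
    """
    codes = np.asarray(codes, dtype=float).reshape(len(codes), -1)
    diff = codes[:, None, :] - codes[None, :, :]
    return np.linalg.norm(diff, axis=-1)


int_dist = pairwise_distances(int_codes)
onehot_dist = pairwise_distances(one_hot)

# With integer codes, "X positive" is twice as far from "no record"
# as it is from "X negative".
print(int_dist[0, 1], int_dist[0, 2])

# With one-hot codes, every pair of distinct categories is equally far apart.
print(onehot_dist[0, 1], onehot_dist[0, 2])
```

If the integers are fed to the network as a numeric feature, that unequal spacing is part of the input. If they are only used as row indices into a learned embedding table, the ordering carries no distance information, because an index lookup is equivalent to multiplying a one-hot vector by the embedding matrix. Which of the two applies depends on how the embedding network ends up consuming these values.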