create_embedding_dataset.py creates the embeddings.
embeddings.yaml is the config.
You can specify embedding/module/embedding in the config to select the module that is supposed to produce the embeddings.
When called, the model needs to return the embeddings for the data fed into it.
The interface is provided in dataset/embeddings/embedding_backend_model.py/BaseEmbedModel.
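A minimal sketch of what such an interface could look like, assuming only the contract stated above (calling the model returns one embedding per input). The names BaseEmbedModelSketch, embed and MeanStdEmbedModel are hypothetical and do not come from embedding_backend_model.py; the real BaseEmbedModel may differ.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class BaseEmbedModelSketch(ABC):
    """Hypothetical stand-in for BaseEmbedModel (not the real class)."""

    @abstractmethod
    def embed(self, batch: Sequence[Sequence[float]]) -> List[List[float]]:
        """Return one embedding vector per input sample."""

    def __call__(self, batch: Sequence[Sequence[float]]) -> List[List[float]]:
        # Calling the model must yield the embeddings for the data fed into it.
        return self.embed(batch)


class MeanStdEmbedModel(BaseEmbedModelSketch):
    """Toy backend: embeds each waveform as [mean, standard deviation]."""

    def embed(self, batch):
        embeddings = []
        for sample in batch:
            mean = sum(sample) / len(sample)
            var = sum((x - mean) ** 2 for x in sample) / len(sample)
            embeddings.append([mean, var ** 0.5])
        return embeddings


model = MeanStdEmbedModel()
embeddings = model([[0.0, 2.0], [1.0, 1.0, 1.0]])
```

Any backend that implements the single abstract method can then be swapped in through the config without changing the script that writes the embedding dataset.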
Acceleration of chirp-models was not possible, as the tf.function decorator did not work with calls to the chirp library.
The datamodule is changed as follows:
The dataset has a flag "mode" (Literal["local", "hf"]) that states whether the dataset should be loaded from disk or from Hugging Face (which can still be from disk, as a cache path is given).
BaseDataModuleHF is changed:
_load_data is renamed to _load_and_configure_data.
_load_and_configure_data calls _load_data and _configure_data.
_load_data instantiates DsManager from utils.path_utils.
DsManager handles loading and storing of data (according to the "mode" flag).
This allows the same data module to be used for datasets that are only stored locally as well.
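The flow described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from utils.path_utils: DsManagerSketch, DataModuleSketch, the loaders argument and the row format are all hypothetical, and only the method names _load_data, _configure_data and _load_and_configure_data are taken from the description.

```python
from typing import Callable, Dict, List, Literal

Mode = Literal["local", "hf"]
Rows = List[Dict[str, object]]


class DsManagerSketch:
    """Hypothetical stand-in for DsManager: picks a loader based on `mode`."""

    def __init__(self, mode: Mode, loaders: Dict[str, Callable[[], Rows]]):
        if mode not in ("local", "hf"):
            raise ValueError(f"unknown mode: {mode!r}")
        self.mode = mode
        self._loaders = loaders

    def load(self) -> Rows:
        # Dispatch to the loader registered for the configured mode.
        return self._loaders[self.mode]()


class DataModuleSketch:
    """Hypothetical data module that works for both modes."""

    def __init__(self, mode: Mode, loaders: Dict[str, Callable[[], Rows]]):
        self.mode = mode
        self.loaders = loaders

    def _load_data(self) -> Rows:
        # The manager is instantiated here, as in the description above.
        return DsManagerSketch(self.mode, self.loaders).load()

    def _configure_data(self, rows: Rows) -> Rows:
        # Placeholder configuration step: keep only rows that carry a label.
        return [row for row in rows if "label" in row]

    def _load_and_configure_data(self) -> Rows:
        return self._configure_data(self._load_data())


# Assumed loaders; real ones would read from disk or from the Hugging Face cache.
loaders = {
    "local": lambda: [{"filename": "a.wav", "label": 0}, {"filename": "b.wav"}],
    "hf": lambda: [{"filename": "c.wav", "label": 1}],
}
local_rows = DataModuleSketch("local", loaders)._load_and_configure_data()
hf_rows = DataModuleSketch("hf", loaders)._load_and_configure_data()
```

Keeping the mode decision inside the manager means the data module itself does not branch on where the data comes from.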
GADMEDataModule is changed:
_preprocess_data now calls _preprocess_multiclass if the task is multiclass, or _preprocess_multilabel if the task is multilabel.
This makes inheritance easier and improves readability.
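A rough sketch of this dispatch, assuming a task attribute with the values "multiclass" and "multilabel". The class name, the num_classes parameter and the row layout are hypothetical; only the three method names come from the description above.

```python
from typing import Dict, List


class PreprocessSketch:
    """Hypothetical illustration of the task-based dispatch."""

    def __init__(self, task: str, num_classes: int):
        self.task = task
        self.num_classes = num_classes

    def _preprocess_data(self, rows: List[Dict]) -> List[Dict]:
        # Route to the task-specific hook; subclasses override only the hook.
        if self.task == "multiclass":
            return self._preprocess_multiclass(rows)
        if self.task == "multilabel":
            return self._preprocess_multilabel(rows)
        raise ValueError(f"unsupported task: {self.task!r}")

    def _preprocess_multiclass(self, rows: List[Dict]) -> List[Dict]:
        # Assumption: multiclass keeps the first label as a single class index.
        return [{"labels": row["labels"][0]} for row in rows]

    def _preprocess_multilabel(self, rows: List[Dict]) -> List[Dict]:
        # Assumption: multilabel turns the label list into a multi-hot vector.
        out = []
        for row in rows:
            multi_hot = [0] * self.num_classes
            for label in row["labels"]:
                multi_hot[label] = 1
            out.append({"labels": multi_hot})
        return out


rows = [{"labels": [2, 0]}, {"labels": [1]}]
multiclass_rows = PreprocessSketch("multiclass", 3)._preprocess_data(rows)
multilabel_rows = PreprocessSketch("multilabel", 3)._preprocess_data(rows)
```

A subclass that needs different multilabel handling can override _preprocess_multilabel alone and leave _preprocess_data untouched.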
embeddings_datamodule/EmbeddingsDataModule is introduced:
It is specialized for loading a dataset that has a column "embeddings" instead of "filename" to gather the data.
This requires different transforms etc.
The datamodule does not use sampling methods yet.
Configs are changed to reflect the interface changes.
We fix AUROC in the multiclass setting.
We make the average in mAP/cmAP configurable.
We add calibration error for multilabel.
We create the logistic regression model for use on embeddings.
We add files in dataset/embeddings.