Examples of Unlabeled data

mdinesh9 commented 2 months ago

Hi @Alcoholrithm

The example shown in the README seems to use a labeled dataset. Are there any examples of using the library with an unlabeled dataset, for example VIME trained on unlabeled data only?

Thanks.

Alcoholrithm commented 2 months ago

Hi @mdinesh9

Thank you for your question!

While the example in the README uses a labeled dataset, the first phase learning of VIME (the self-supervised step) is specifically designed for unlabeled datasets. During this phase, the "X" and "unlabeled_data" parameters of VIMEDataset are both treated as the same type of data: unlabeled data.

So, if you only have an unlabeled dataset, you can still perform the self-supervised learning step of VIME. Pass your unlabeled data to the "X" parameter and set "unlabeled_data" to None when initializing the VIMEDataset. The model then learns from the unlabeled data without needing any labeled data.

The following code block provides an example for your use case.

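### First Phase Learning
# VIMEDataset, TS3LDataModule, and Trainer are imported as in the README example;
# pl_vime, config, batch_size, continuous_cols, and category_cols are defined there as well.

# Only unlabeled data is available, so pass it as X and leave unlabeled_data as None.
train_ds = VIMEDataset(X=X_train, unlabeled_data=None, config=config, continuous_cols=continuous_cols, category_cols=category_cols)
valid_ds = VIMEDataset(X=X_valid, config=config, continuous_cols=continuous_cols, category_cols=category_cols)

datamodule = TS3LDataModule(train_ds, valid_ds, batch_size, train_sampler='random')

trainer = Trainer(
    accelerator='cpu',
    max_epochs=20,
    num_sanity_val_steps=2,
)

# Run the self-supervised (first phase) training on the unlabeled data.
trainer.fit(pl_vime, datamodule)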
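
For context on why this phase needs no labels: VIME's self-supervised objective corrupts some features of each row and trains the model to recover both the corruption mask and the original values, so the training targets come from the data itself. The snippet below is only a rough, standard-library sketch of that corruption step as described in the VIME paper. It is not the code TabularS3L runs internally, and the names `mask_generator`, `pretext_generator`, and `p_m` are illustrative assumptions.

```python
import random


def mask_generator(p_m, n_rows, n_cols, rng):
    """Draw a binary mask; each entry is 1 with probability p_m."""
    return [[1 if rng.random() < p_m else 0 for _ in range(n_cols)]
            for _ in range(n_rows)]


def pretext_generator(mask, x, rng):
    """Replace masked entries of x with values drawn from the same column."""
    n_rows, n_cols = len(x), len(x[0])

    # Shuffle each column independently, which samples replacement values
    # from that feature's empirical distribution.
    x_bar = [[None] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        column = [row[j] for row in x]
        rng.shuffle(column)
        for i in range(n_rows):
            x_bar[i][j] = column[i]

    # Keep the original value where mask == 0, use the shuffled one where mask == 1.
    x_tilde = [[x_bar[i][j] if mask[i][j] else x[i][j] for j in range(n_cols)]
               for i in range(n_rows)]

    # The effective mask marks only the entries whose value actually changed.
    new_mask = [[int(x[i][j] != x_tilde[i][j]) for j in range(n_cols)]
                for i in range(n_rows)]
    return new_mask, x_tilde


# Toy unlabeled table: four rows, three features, no target column.
rng = random.Random(0)
x = [[0.1, 1.0, 5.0],
     [0.2, 0.0, 7.0],
     [0.3, 1.0, 9.0],
     [0.4, 0.0, 11.0]]

mask = mask_generator(0.3, len(x), len(x[0]), rng)
new_mask, x_tilde = pretext_generator(mask, x, rng)
```

In this sketch, `new_mask` and `x` play the role of the targets and `x_tilde` is the model input, which is why a labels column is never required for the first phase.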

I hope this clarifies your question.

mdinesh9 commented 2 months ago

Thank you @Alcoholrithm

Alcoholrithm / TabularS3L
