CounterFactualAI / kaenai

Kaen 🔥火炎🔥 is a toolkit for delightful deep learning with PyTorch in public clouds.
GNU Affero General Public License v3.0

MLOps Engineering book error #2

Open Philmod opened 1 month ago

Philmod commented 1 month ago

Hi,

I tried to use osds with the examples from your book, but every call fails with an error like the one below. The same thing happened in the previous chapter when reading files from S3.

train_ds = osds('https://raw.githubusercontent.com/osipov/smlbook/master/train.csv', batch_size=int(model.hparams.batch_size))
WARNING:root:partitions_glob is not specified at initialization, attempting to proceed with index_shards=True which can take a while for large objects.
[...]
FileNotFoundError: [Errno 2] No such file or directory: 'filecache::https://raw.githubusercontent.com/osipov/smlbook/master/train.csv'

Philmod commented 1 month ago

In the meantime, I am using this workaround, which loads the CSV with pandas instead of osds:

import pandas as pd
from torch.utils.data import DataLoader, Dataset


class CustomDataset(Dataset):
    """Map-style dataset over a DataFrame; column 0 is treated as the label."""

    def __init__(self, dataframe):
        self.dataframe = dataframe

    def __getitem__(self, index):
        # Split one row into the label (column 0) and the features (the rest).
        row = self.dataframe.iloc[index].to_numpy()
        features = row[1:]
        label = row[0]
        return features, label

    def __len__(self):
        return len(self.dataframe)


train_df = pd.read_csv('https://raw.githubusercontent.com/osipov/smlbook/master/train.csv')
train_ds = CustomDataset(dataframe=train_df)
train_dl = DataLoader(train_ds, pin_memory=True)

Philmod commented 1 month ago

Also, in the same chapter (chapter 10), I get another error: 'list' object has no attribute 'squeeze_'. I'm not sure whether it is related to the previous issue.
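
One possible cause of the squeeze_ error, sketched under assumptions: if the chapter's training code calls squeeze_() directly on each batch (an assumption, not confirmed in this thread), then a dataset that returns a (features, label) tuple would trigger it, because DataLoader's default collation turns tuples into a Python list of tensors rather than a single tensor. The sketch below uses a small made-up tensor instead of train.csv; PairDataset and RowDataset are hypothetical names introduced only for illustration.

```python
import torch
from torch.utils.data import DataLoader, Dataset

# Made-up stand-in data: column 0 is the label, the rest are features.
data = torch.tensor([
    [1.0, 0.1, 5.0],
    [2.0, 0.2, 6.0],
    [3.0, 0.3, 7.0],
    [4.0, 0.4, 8.0],
])


class PairDataset(Dataset):
    """Returns (features, label) tuples, like the workaround above."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return self.data[index, 1:], self.data[index, 0]

    def __len__(self):
        return len(self.data)


class RowDataset(Dataset):
    """Hypothetical alternative: returns each row as one tensor."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return self.data[index]

    def __len__(self):
        return len(self.data)


# Tuples are collated into a list [features_batch, labels_batch],
# and a list has no squeeze_ method.
pair_batch = next(iter(DataLoader(PairDataset(data), batch_size=2)))
print(type(pair_batch).__name__, hasattr(pair_batch, "squeeze_"))

# Whole rows are collated into a single tensor, so squeeze_ is available
# and the label/feature split can happen after batching.
row_batch = next(iter(DataLoader(RowDataset(data), batch_size=2)))
row_batch.squeeze_()
X, y = row_batch[:, 1:], row_batch[:, 0]
print(tuple(X.shape), tuple(y.shape))
```

With the workaround's DataFrame, the equivalent input tensor could be built with torch.tensor(train_df.to_numpy(), dtype=torch.float32). If the training code does not call squeeze_() on the batch, this explanation does not apply and the error has another source.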