uber / petastorm

Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.
Apache License 2.0

TypeError: __init__() missing 2 required positional arguments: 'instance' and 'token' #790

Open devVipin01 opened 1 year ago

I want to load Parquet row groups into batches. The code below works fine on my local system:

import tensorflow as tf
from petastorm import make_batch_reader
from petastorm.tf_utils import make_petastorm_dataset

# dbfs:/output/scaled.parquet
batch_size = 2
with make_batch_reader('file:/dbfs:/output/scaled.parquet', num_epochs=1, shuffle_row_groups=False) as train_reader:
    train_ds = make_petastorm_dataset(train_reader).unbatch().map(lambda x: tf.convert_to_tensor(x)).batch(batch_size)
    for batch in train_ds:
        # `model` is a Keras model defined elsewhere
        X_train = tf.reshape(batch, (2, 1, 15))
        model.fit(X_train, X_train)

But when I run it on Databricks, it raises the following error: TypeError: __init__() missing 2 required positional arguments: 'instance' and 'token'

Please help me resolve this issue.
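
One possible cause, offered as an assumption and not a confirmed diagnosis: a URL containing the `dbfs:` scheme may be routed to fsspec's Databricks REST filesystem, whose constructor expects `instance` and `token`. On Databricks, DBFS is also exposed on the driver's local filesystem under `/dbfs`, so a `file:///dbfs/...` URL may avoid that code path. The sketch below uses a hypothetical helper, `dbfs_to_file_url` (not part of Petastorm), to rewrite the path.

```python
def dbfs_to_file_url(path: str) -> str:
    """Hypothetical helper: map a DBFS path to a local-file URL.

    Assumes DBFS is mounted on the local filesystem at /dbfs,
    as it is on a Databricks cluster driver.
    """
    if path.startswith("dbfs:"):
        rest = path[len("dbfs:"):]
    elif path.startswith("/dbfs/"):
        rest = path[len("/dbfs"):]
    else:
        raise ValueError(f"not a DBFS path: {path!r}")
    # Normalize to exactly one slash between the mount point and the rest.
    return "file:///dbfs/" + rest.lstrip("/")


url = dbfs_to_file_url("dbfs:/output/scaled.parquet")

# Intended use on Databricks (requires petastorm; shown as a comment so the
# helper above stays runnable on its own):
# with make_batch_reader(url, num_epochs=1, shuffle_row_groups=False) as train_reader:
#     ...
```

If the error persists with a `file:///dbfs/...` URL, the cause lies elsewhere and the full traceback would be needed to narrow it down.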