Open h4ck4l1 opened 1 week ago
If you need me to make the dataset public so that you can run this in your own environment, in case the issue is unique to my setup, I will make it public.
Edit: made it public.
Edit 2: All you need to do is add two secrets: one with the key "username", whose value is your Kaggle username, and one with the key "key", whose value is the unique key from your kaggle.json. Then run it; the notebook will stop at the predict step.
Wait, I solved it, but not really, because I ended up with more doubts. I zipped my test dataset with dummy y values, and then predict works properly. Why is that?
The version below does not work at all. It gives me an improper shape warning.
BATCH_SIZE = 256
TEST_STEPS = len(test.tfrecord)//BATCH_SIZE
test_ds = (
    tf.data.TFRecordDataset("belka-tfrecords/test_10.tfrecord", compression_type="GZIP", num_parallel_reads=AUTO)
    .map(test_belka_example, num_parallel_calls=AUTO)
)
X_test,y_test = total_test_ds.take(1).get_single_element()
print("test molecule smiles: ",X_test[0].shape) # (256, 1024)
print("test molecule tokens: ",X_test[1].shape) # (256, 142)
model.predict(test_ds,steps=TEST_STEPS,use_multiprocessing=True)
The version below works:
BATCH_SIZE = 256
TEST_STEPS = len(test.tfrecord)//BATCH_SIZE
test_ds = (
    tf.data.TFRecordDataset("belka-tfrecords/test_10.tfrecord", compression_type="GZIP", num_parallel_reads=AUTO)
    .map(test_belka_example, num_parallel_calls=AUTO)
)
dummy_y = (
    tf.data.Dataset.from_tensor_slices(tf.random.uniform(shape=[1674896], minval=0, maxval=2, dtype=tf.int32))
)
total_test_ds = tf.data.Dataset.zip((test_ds, dummy_y)).batch(256, num_parallel_calls=AUTO)
X_test,y_test = total_test_ds.take(1).get_single_element()
print("test molecule smiles: ",X_test[0].shape) # (256,1024)
print("test molecule tokens: ",X_test[1].shape) # (256,142)
print("test y: ",y_test.shape) # (256,)
model.predict(total_test_ds,steps=TEST_STEPS,use_multiprocessing=True)
Why does a test dataset without y values not work directly?
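One possible explanation, offered as an assumption and not confirmed anywhere in this thread: when Keras receives a dataset whose elements are 2-tuples, it reads each element as (inputs, targets). If `test_belka_example` returns a plain `(smiles, tokens)` tuple, `predict` would then see only the first tensor as input and treat the second as labels. The pure-Python sketch below mimics that splitting rule as I understand it; `split_element` and the placeholder strings are hypothetical stand-ins, not Keras code and not real tensors.

```python
def split_element(element):
    """Simplified stand-in for how Keras is assumed to split one dataset
    element into (x, y, sample_weight). Not the actual Keras implementation."""
    if isinstance(element, list):
        element = tuple(element)
    if not isinstance(element, tuple):
        # A bare tensor or dict is treated as inputs only.
        return element, None, None
    if len(element) == 1:
        return element[0], None, None
    if len(element) == 2:
        return element[0], element[1], None
    if len(element) == 3:
        return element
    raise ValueError("Expected a tuple of length 1, 2 or 3")


# Placeholders standing in for the two batched input tensors.
smiles, tokens = "smiles_batch", "tokens_batch"

# Case 1: the dataset yields (smiles, tokens) directly.
# The second input is mistaken for the targets.
x, y, _ = split_element((smiles, tokens))

# Case 2: zipped with dummy labels, as in the working snippet.
# Both inputs stay together as x.
x_dummy, y_dummy, _ = split_element(((smiles, tokens), "dummy_y"))

# Case 3: wrapping the inputs in a 1-tuple also keeps them together,
# without needing any dummy labels.
x_wrapped, y_wrapped, _ = split_element(((smiles, tokens),))

print(x, y)
print(x_dummy, y_dummy)
print(x_wrapped, y_wrapped)
```

If this is what is happening here, dummy labels are not the only way out: mapping each element to a 1-tuple such as `((smiles, tokens),)` before calling `.batch(...)` should let `predict` see both inputs. Note also that the failing snippet never calls `.batch(...)` on `test_ds`, which on its own could plausibly trigger a shape warning.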
I get this error when trying to predict on a TFRecord dataset.
Error Message:
My inputs are of shape ([BATCH_SIZE, 1024], [BATCH_SIZE, 142]), and this is the notebook
Explanation:
I am using an encoder-decoder transformer. I have checked for shape mismatch issues, but none came up.
Please, and thank you for any valuable feedback.