When loading any TFRecords file into a torch dataset, the dataset cannot be iterated: the DataLoader worker fails while reading the records. This is similar to #76, but instead of "RuntimeError: Failed to read the record." we now get a ValueError (see the traceback below).
from torch.utils.data import DataLoader
from profit.utils.data_utils.datasets import TorchTFRecordsDataset
data = TorchTFRecordsDataset("data/3gb1/processed/transformer_fitness/primary.tfrecords")
loader = DataLoader(data, batch_size=64, num_workers=2)
for batch in loader:
    print([arr.shape for arr in batch.values()])
Current behavior
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ayushkarnawat/miniconda3/envs/chem/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/Users/ayushkarnawat/miniconda3/envs/chem/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/Users/ayushkarnawat/miniconda3/envs/chem/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/Users/ayushkarnawat/miniconda3/envs/chem/lib/python3.7/site-packages/torch/_utils.py", line 394, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/Users/ayushkarnawat/miniconda3/envs/chem/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/Users/ayushkarnawat/miniconda3/envs/chem/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 28, in fetch
    data.append(next(self.dataset_iter))
  File "/Users/ayushkarnawat/Documents/dev/python_workspace/profit/profit/utils/data_utils/datasets.py", line 299, in __iter__
    for record in records:
  File "/Users/ayushkarnawat/Documents/dev/python_workspace/profit/profit/utils/data_utils/tfreader.py", line 133, in tfrecord_loader
    value = np.frombuffer(value[0])
ValueError: buffer size must be a multiple of element size
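
For context on the final ValueError: np.frombuffer interprets the buffer as float64 (8-byte elements) when no dtype is passed, so any byte string whose length is not a multiple of 8 fails with this message. The sketch below reproduces the message in isolation. The 10-byte payload and the try_frombuffer helper are hypothetical and only for illustration; the actual dtype of the serialized features in primary.tfrecords is not known here, so this is one plausible reading of the failure, not a confirmed diagnosis.

```python
import numpy as np

# Hypothetical payload: 10 raw bytes, which is not a multiple of 8.
raw = b"\x00" * 10


def try_frombuffer(buf, dtype=None):
    """Return the decoded array, or the ValueError message if decoding fails."""
    try:
        if dtype is None:
            # No dtype given: NumPy assumes float64 (8 bytes per element).
            return np.frombuffer(buf)
        return np.frombuffer(buf, dtype=dtype)
    except ValueError as exc:
        return str(exc)


# Fails because 10 bytes cannot be split into 8-byte float64 elements.
default_result = try_frombuffer(raw)

# Succeeds because uint8 elements are 1 byte each (assumed dtype, for illustration).
explicit_result = try_frombuffer(raw, np.uint8)

print(default_result)
print(explicit_result.shape)
```

If the records were written with a dtype other than float64, passing that dtype explicitly at the np.frombuffer call in tfreader.py would be the first thing to check.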
Expected behavior