Closed jakedailey1 closed 7 years ago
Have you tried on Linux? I suspect an issue with the size of long ints on Windows. Technically, the only supported setup is saving and loading files on the same system, both for Torch and for this library. If Lua Torch loads it fine on Windows, perhaps they're remapping long to int64_t rather than using Windows' longs.
I think you're right. I was able to open the file on a Linux box and reformat to something I can work with on Windows.
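For anyone hitting the same symptom, here is a minimal sketch of how a width mismatch can produce a stray type tag of 0. It assumes little-endian data, a 4-byte int for tags, and an 8-byte long on the writing machine. The two-field byte layout and the helper `read_long_then_tag` are made up for illustration and are not the actual t7 layout or torchfile code.

```python
import io
import struct

# Hypothetical stream: an 8-byte long (value 1) followed by a 4-byte int tag (value 4).
data = struct.pack("<q", 1) + struct.pack("<i", 4)

def read_long_then_tag(buf, long_fmt):
    """Read one 'long' using long_fmt, then one 4-byte int tag."""
    f = io.BytesIO(buf)
    (value,) = struct.unpack(long_fmt, f.read(struct.calcsize(long_fmt)))
    (tag,) = struct.unpack("<i", f.read(4))
    return value, tag

# Reader that agrees with the writer: longs are 8 bytes.
matched = read_long_then_tag(data, "<q")

# Reader that assumes 4-byte longs: it stops halfway through the long,
# so the "tag" it reads next is really the upper half of that long.
mismatched = read_long_then_tag(data, "<i")

print(matched, mismatched)

# Width of the native C long on the current machine
# (4 on 64-bit Windows, 8 on typical 64-bit Linux).
print(struct.calcsize("l"))
```

With 4-byte longs the reader falls out of step after the first long, and the next tag it sees is the zeroed upper half of that value, which matches the typeidx == 0 behaviour described in the report.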
I don't think there's a direct workaround. Per this Python bug post, it seems this is bound to happen if the original serialization was done using the machine's native formatting.
I'm sure typical users will not face this issue, but do you know if t7 files have anything explicit that we can use to warn users about these sorts of system/type mismatches (or is this just implicit in the binary)?
Unfortunately, binary t7 files were never meant to be portable, so no. The "fix" would be to have changeable settings for the size of each datatype when loading a file, as the t7 files don't save that themselves. I'd suggest hdf5 or ascii t7 files as a portable alternative that may suit most of your needs.
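As a rough sketch of what "changeable settings for the size of each datatype" could look like, the reader below takes the width of a long as a parameter instead of relying on the platform. `SizedReader`, `long_size`, and `byte_order` are hypothetical names for illustration and are not part of torchfile's API; the payload is a made-up stand-in, not real t7 data.

```python
import io
import struct

class SizedReader:
    """Hypothetical reader (not torchfile): the width of 'long' is a setting."""

    def __init__(self, f, long_size=8, byte_order="<"):
        if long_size not in (4, 8):
            raise ValueError("long_size must be 4 or 8")
        self.f = f
        # Explicit byte order plus standard sizes, so the result does not
        # depend on the platform's native C long.
        self.long_fmt = byte_order + ("q" if long_size == 8 else "i")
        self.int_fmt = byte_order + "i"

    def _read(self, fmt):
        size = struct.calcsize(fmt)
        raw = self.f.read(size)
        if len(raw) != size:
            raise EOFError("unexpected end of file")
        return struct.unpack(fmt, raw)[0]

    def read_int(self):
        return self._read(self.int_fmt)

    def read_long(self):
        return self._read(self.long_fmt)

# Made-up payload written with 8-byte longs: one long, then one int tag.
payload = struct.pack("<q", 123456789012) + struct.pack("<i", 4)

reader = SizedReader(io.BytesIO(payload), long_size=8)
long_value = reader.read_long()
tag = reader.read_int()
print(long_value, tag)
```

The caller still has to know which width the writing machine used, since the file itself does not record it.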
I see. Thanks for your help with this @bshillingford, I'll close this out now.
Disclaimer: new to trying to work with serialized data, so this may be user error.
I've been working through implementing Professor de Freitas's Machine Learning course materials in TensorFlow (here) and am looking to use the data here to write the LSTM from Practical 6. I was delighted to find that you'd already gone to the trouble of writing this package!
torchfile.load() works fine for vocab.t7, but returns an empty array when I try running torchfile.load('train.t7'). When I dig into the code deeper, it seems that T7Reader correctly finds that typeidx == 4 (Torch) after calling reader.read_obj() a first time. The code proceeds to call type_handlers, but then on the second call to read_obj() within read_tensor_generic, finds that typeidx == 0. The result is that storage is None, causing the code to return an empty array.
For reference, the class_name found is torch.ByteTensor and the version_number returned is 1. I'm running Python 3.5 via Conda on Windows x64.
Thank you! Jake