Open lzamparo opened 8 years ago
I hit the same problem using the Docker image, with Python 2.7.6. It seems to be solved by editing Basset/src/seq_hdf5.py, lines 81-89, and wrapping the values assigned to test_count, train_count and valid_count in int():
train_count = seqs.shape[0] - test_count - valid_count
train_count = int(batch_round(train_count, options.batch_size))
print >> sys.stderr, '%d training sequences ' % train_count
test_count = int(batch_round(test_count, options.batch_size))
print >> sys.stderr, '%d test sequences ' % test_count
valid_count = int(batch_round(valid_count, options.batch_size))
print >> sys.stderr, '%d validation sequences ' % valid_count
I guess the error at line 92 of the current seq_hdf5.py when running Python 2.7.5 is related. Onceupon, could you open a pull request?
[ca445@cbsugpu01 bassetfiles]$ python /usr/local/Basset-0.1.0/src/seq_hdf5.py -r -c -v 3000 -t 3000 learn_cd4.fa lt.txt learn_cd4.h5
85261 training sequences
3000 test sequences
3000 validation sequences
Traceback (most recent call last):
  File "/usr/local/Basset-0.1.0/src/seq_hdf5.py", line 130, in <module>
    main()
  File "/usr/local/Basset-0.1.0/src/seq_hdf5.py", line 92, in main
    train_seqs, train_targets = seqs[i:i+train_count,:], targets[i:i+train_count,:]
TypeError: slice indices must be integers or None or have an __index__ method
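For anyone else landing here, the TypeError in the traceback above can be reproduced outside Basset with a plain Python list. This is only a minimal sketch: the batch_round below is a hypothetical stand-in (I am assuming it rounds a count down to a multiple of the batch size), and I am assuming the counts reach the slice as floats, which would explain why wrapping them in int() helps.

```python
def batch_round(count, batch_size):
    """Hypothetical stand-in: round count down to a multiple of batch_size."""
    if batch_size is not None:
        count -= count % batch_size
    return count

seqs = list(range(10))
valid_count = 5.0  # assumption: the count arrives as a float

# Slicing with a float bound raises TypeError.
try:
    seqs[0:valid_count]
    float_slice_ok = True
except TypeError:
    float_slice_ok = False

# Casting the rounded count to int makes the slice valid.
valid_count = int(batch_round(valid_count, 2))
valid_seqs = seqs[0:valid_count]
```

With the cast in place, the rounded count is an int and the slice goes through.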
Hi,
Love the package, I'm keen to try it out for myself. Just wanted to point out that it doesn't seem to play well with Python 3. Some errors are easily fixed by running 2to3 on the relevant .py files, but some are not. For example, try running install_data.py using Anaconda Python 3. I get the following:
The same script seems to succeed using Anaconda Python 2.7.1 (though I can't be sure, since the seq_hdf5.py step takes a while to complete). I'll use that for my purposes, but maybe you should update the README to say explicitly that Python 2 is required?
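Besides a README note, the scripts could also check the interpreter version themselves. The following is just a sketch of that idea; the function name python2_error is made up for illustration and nothing like it exists in Basset as far as I know.

```python
import sys

def python2_error(version_info=sys.version_info):
    """Return an error message if the major version is not 2, else None."""
    if version_info[0] != 2:
        return "This script requires Python 2; found Python %d.%d" % (
            version_info[0], version_info[1])
    return None

# Possible use at the top of a script:
#   msg = python2_error()
#   if msg is not None:
#       sys.exit(msg)
```

Failing early with a clear message would be friendlier than a traceback partway through a long-running step.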