jakeyeung closed this issue 6 years ago
Ah sorry. I'm working to add support for stranded sequencing datasets. I pushed a fix to master.
Hello Dave,
May I ask how the identifier column is used? I'm trying to run basenji_hdf5_single.py on another dataset downloaded from ENCODE, but it fails with:
[urlOpen] Couldn't open identifier for reading
[urlOpen] Couldn't open identifier for reading
[pyBwOpen] bw is NULL!
What should the identifier column contain? Thank you!
Best regards, Webb
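The error above reads as if the literal string "identifier" was handed to the bigWig reader as a file path, for example because a header row or a non-path column was parsed as the coverage file. A minimal sketch for checking this before running the script, assuming a tab-separated sample file; the function name `find_missing_paths` and the `path_col` index are hypothetical and not part of Basenji:

```python
import csv
import os


def find_missing_paths(sample_file, path_col=1, delimiter="\t"):
    """Return (row_number, value) pairs whose path column is not an existing file.

    path_col is an assumption: set it to whichever column holds the
    bigWig path in your sample file.
    """
    missing = []
    with open(sample_file, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for row_no, row in enumerate(reader, start=1):
            if not row:
                continue  # skip blank lines
            if path_col >= len(row):
                # The row has no such column at all.
                missing.append((row_no, None))
            elif not os.path.isfile(row[path_col]):
                # The value is present but does not point to a file on disk.
                missing.append((row_no, row[path_col]))
    return missing
```

Any row reported with a value such as `identifier` would point to a header line or a column-order mismatch rather than a problem with the bigWig files themselves.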
Hi Webb, could you provide some more details about the command that you ran? I now suggest using TFRecords rather than HDF5 for input data, using my basenji_data.py script. You can see an example of that here: https://github.com/calico/basenji/blob/master/tutorials/preprocess.ipynb
Hi Dave, OK, I see! So there's no need to use the file "heart_l131k.h5" in the train_test.ipynb tutorial now, right? Thank you very much! Best regards, Webb
Yes, that's right. I see now that I accidentally left that in the tutorial. I just uploaded a new version that excludes that file.
Hello Dave,
I was wondering what the input format is for sample_wig_files in the inputs for basenji_hdf5_single.py. When I looked at the tutorial, it suggested a format with two columns (from data/heart_wigs.txt).
Unfortunately, when I run basenji_hdf5_single.py, it gives me an index error at line 186. It looks like it was trying to access a[2], which would be the third column of data/heart_wigs.txt, which does not exist in heart_wigs.txt.
What is the expected input format for sample_wig_files?
Best,
Jake
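Since the index error comes from reading a[2] on a two-column row, one way to confirm the mismatch is to count the columns in each row before running the script. A minimal sketch, assuming whitespace-separated columns; `rows_with_too_few_columns` is a hypothetical helper, and `min_cols=3` is chosen only because a[2] implies at least three columns, not because the expected format is known:

```python
def rows_with_too_few_columns(sample_file, min_cols=3):
    """Return (line_number, column_count) for rows with fewer than min_cols fields.

    min_cols=3 is an assumption inferred from the a[2] access; adjust it
    once the expected format of the sample file is confirmed.
    """
    short = []
    with open(sample_file) as handle:
        for line_no, line in enumerate(handle, start=1):
            fields = line.split()  # split on any run of whitespace
            if fields and len(fields) < min_cols:
                short.append((line_no, len(fields)))
    return short
```

An empty result means every non-blank row has at least `min_cols` fields; any entries returned identify the rows that would trigger the index error.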