calico / basenji

Sequential regulatory activity predictions with deep convolutional neural networks.
Apache License 2.0
Expose 'raw' as an argument? #173

Open Yaoyx opened 1 year ago

Yaoyx commented 1 year ago

https://github.com/calico/basenji/blob/615b9eec8a591783b16d959029ddad08edae853d/basenji/dataset.py#L215

By default, this currently returns the DNA sequence in index format. Should 'raw' be exposed as an argument, in case the hdf5 file stores sequences in index format, which seems to be the default (https://github.com/calico/basenji/blob/615b9eec8a591783b16d959029ddad08edae853d/bin/basenji_data_write.py#L157)?

This would, for example, make the explore_model.ipynb data loading consistent with the input shape and format required for a model prediction.
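
To illustrate the format mismatch described above, here is a minimal sketch of converting an index-encoded sequence to one-hot with NumPy. It is not basenji code: the function name `indices_to_one_hot`, the mapping A=0, C=1, G=2, T=3 (with 4 standing in for N), and the (batch, length, 4) input shape are all assumptions made for illustration.

```python
import numpy as np


def indices_to_one_hot(seq_idx, n_symbols=4, dtype=np.float32):
    """Convert an index-encoded sequence to one-hot (hypothetical helper).

    seq_idx: integer array of shape (length,) or (batch, length).
    Indices outside [0, n_symbols), such as an assumed code for N,
    are left as all-zero rows.
    """
    seq_idx = np.asarray(seq_idx)
    valid = (seq_idx >= 0) & (seq_idx < n_symbols)
    one_hot = np.zeros(seq_idx.shape + (n_symbols,), dtype=dtype)
    # Row i of the identity matrix is the one-hot vector for index i.
    one_hot[valid] = np.eye(n_symbols, dtype=dtype)[seq_idx[valid]]
    return one_hot


# "ACGTN" under the assumed mapping A=0, C=1, G=2, T=3, N=4.
seq = np.array([0, 1, 2, 3, 4])
x = indices_to_one_hot(seq)   # shape (5, 4)
batch = x[np.newaxis]         # shape (1, 5, 4), assuming the model wants a batch axis
```

If the loader already returned one-hot sequences, a conversion step like this would not be needed before calling the model.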

davek44 commented 1 year ago

I won't have time to look into this for a while. It sounds like you understand what ought to happen, so feel free to submit a pull request.

Yaoyx commented 1 year ago

> I won't have time to look into this for a while. It sounds like you understand what ought to happen, so feel free to submit a pull request.

Sounds good. I just submitted a pull request.