Open suyashkumar opened 4 years ago
@suyashkumar any updates on this? I see that a tutorial using these APIs has been published: https://www.tensorflow.org/io/tutorials/genome
Please let me know if this can be closed.
Hi @kvignesh1420, I don't think we got back around to exposing an API that returns a `tf.data.Dataset`. It may be somewhat useful, but it is not necessary for folks to take advantage of the genome functionality.
@suyashkumar It would be nice to have such an API. Please let me know your suggestions on this issue. If you can contribute and close this, that would be great.
Opening this issue to track the development of the `tf.data.Dataset` API for `tfio.genome` ops discussed in my last PR #620.
We should be able to build a `tf.data.Dataset` in eager mode by combining some of the current genome ops.
Ideally we would expose something like `tfio.IODataset.from_fastq(filenames, convert_quality=False, convert_to_onehot=False)` that would read the FASTQ file(s), optionally convert the nucleotides to one-hot representations and/or convert the quality scores to probabilities based on the arguments to the call, and return a `tf.data.Dataset`.
Will begin work on this sometime this week or weekend!
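For anyone following along, here is a rough plain-Python sketch of the per-record behaviour the proposed call describes. It is not the `tfio` implementation: the helper names (`read_fastq_records`, `sequence_to_onehot`, `phred_to_probability`, `fastq_examples`) are hypothetical, the A/C/G/T one-hot order is an assumption, and the quality conversion assumes Phred+33 encoding with P = 10^(-Q/10).

```python
import io
from typing import Iterator, List, TextIO, Tuple

# Assumed nucleotide order for the one-hot vectors; the real op may differ.
_ONEHOT = {
    "A": [1, 0, 0, 0],
    "C": [0, 1, 0, 0],
    "G": [0, 0, 1, 0],
    "T": [0, 0, 0, 1],
}


def read_fastq_records(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (sequence, raw_quality) pairs from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().strip()
        handle.readline()  # the '+' separator line is ignored
        quality = handle.readline().strip()
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield sequence, quality


def sequence_to_onehot(sequence: str) -> List[List[int]]:
    """Map each base to a 4-element vector; unknown bases (e.g. N) become all zeros."""
    return [list(_ONEHOT.get(base.upper(), [0, 0, 0, 0])) for base in sequence]


def phred_to_probability(quality: str, offset: int = 33) -> List[float]:
    """Convert Phred-encoded characters to error probabilities (assumes Phred+33)."""
    return [10 ** (-(ord(char) - offset) / 10) for char in quality]


def fastq_examples(handle: TextIO, convert_quality: bool = False,
                   convert_to_onehot: bool = False):
    """Yield one (sequence, quality) example per record, converted per the flags."""
    for sequence, quality in read_fastq_records(handle):
        yield (
            sequence_to_onehot(sequence) if convert_to_onehot else sequence,
            phred_to_probability(quality) if convert_quality else quality,
        )


if __name__ == "__main__":
    sample = io.StringIO("@read1\nACGN\n+\nII!+\n")
    for seq, qual in fastq_examples(sample, convert_quality=True, convert_to_onehot=True):
        print(seq, qual)
```

A generator like `fastq_examples` could presumably be handed to `tf.data.Dataset.from_generator` to get a dataset in eager mode, though a real `from_fastq` would more likely compose the existing genome ops directly.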