Open hvgazula opened 7 months ago
This suggestion works on a small dataset and doesn't fail. However, it is still time-consuming. The non-zero GPU utilization may be related (🤷‍♂️), since moving data back and forth between the CPU and GPU adds overhead.
https://github.com/neuronets/nobrainer/blob/cb855feaadd4ac354e1e2d1c760a649df3f61ab4/nobrainer/dataset.py#L249-L250
Suggestion:
Notes:
`_labels_all_scalar`
`.isscalar` function which returns a bool flag. Subsequently, the collected bool flags are reduced to one final bool flag.
`nobrainer.tfrecord._is_int_or_float()` on each element of the dataset (in a for loop) and then reduce all the bool flags (similar to step 2).
`repeat` should be delayed until after this operation. Otherwise, the entire repeated dataset will be used for this operation, which is undesirable.
Caveat:
`tf.experimental` function which may (or may not) be deprecated in the future. @satra what are your thoughts on using experimental features in the nobrainer API?
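
For anyone following along, here is a rough sketch of the map-then-reduce idea from the notes, with `repeat` applied only after the check. It is not the nobrainer implementation: `labels_all_scalar` is a hypothetical helper name, the toy datasets are made up, and `tf.equal(tf.rank(y), 0)` is swapped in for the `.isscalar` check as a stand-in that avoids the experimental namespace. It assumes a dataset of `(features, label)` pairs and eager execution.

```python
import tensorflow as tf


def labels_all_scalar(dataset: tf.data.Dataset) -> bool:
    """Hypothetical helper: True if every label in a (features, label) dataset has rank 0."""
    # Per-element check: one bool flag per example, computed inside the pipeline.
    # tf.rank is a stand-in for the `.isscalar` check (assumption, not nobrainer code).
    flags = dataset.map(lambda x, y: tf.equal(tf.rank(y), 0))
    # Reduce the collected flags to one final bool flag with a logical AND.
    all_scalar = flags.reduce(tf.constant(True), tf.logical_and)
    return bool(all_scalar.numpy())


# Toy data (made up): four 2x2x2 "volumes" with either scalar or volumetric labels.
features = tf.zeros([4, 2, 2, 2])
scalar_ds = tf.data.Dataset.from_tensor_slices((features, tf.constant([0.0, 1.0, 0.0, 1.0])))
volume_ds = tf.data.Dataset.from_tensor_slices((features, tf.zeros([4, 2, 2, 2])))

scalar_result = labels_all_scalar(scalar_ds)
volume_result = labels_all_scalar(volume_ds)

# Repeat only after the check, so the scan above covers a single pass over the data.
n_epochs = 3
scalar_ds = scalar_ds.repeat(n_epochs)
```

Because both the check and the reduction run as dataset ops, there is no Python-level for loop over elements, which may help with the CPU/GPU round trips mentioned above, though that is untested here.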