tensorflow / ecosystem

Integration of TensorFlow with other open-source frameworks
Apache License 2.0

Why convert SparseVector to DenseVector in your DefaultTfRecordRowEncoder.scala? #146

Open NeilRon opened 4 years ago

NeilRon commented 4 years ago

      case VectorType => {
        val field = row.get(index)
        field match {
          // A SparseVector is densified before encoding: every dimension,
          // including the zeros, is written out as a float.
          case v: SparseVector => FloatListFeatureEncoder.encode(v.toDense.toArray.map(_.toFloat))
          case v: DenseVector => FloatListFeatureEncoder.encode(v.toArray.map(_.toFloat))
          case _ => throw new RuntimeException(s"Cannot convert $field to vector")
        }
      }
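For context, a sparsity-preserving alternative would be to write the vector's indices and values as two separate features instead of densifying. The sketch below is illustrative only, built on the org.tensorflow.example protobuf classes; encodeSparse and the "indices"/"values" feature names are hypothetical, not part of the connector's API.

    import org.apache.spark.ml.linalg.SparseVector
    import org.tensorflow.example.{Feature, FloatList, Int64List}

    object SparseFeatureSketch {
      // Hypothetical sparsity-preserving encoding: one int64 feature for the
      // non-zero positions, one float feature for their values.
      def encodeSparse(v: SparseVector): Map[String, Feature] = {
        val idx = Int64List.newBuilder()
        v.indices.foreach(i => idx.addValue(i.toLong))
        val vals = FloatList.newBuilder()
        v.values.foreach(d => vals.addValue(d.toFloat))
        Map(
          "indices" -> Feature.newBuilder().setInt64List(idx).build(),
          "values"  -> Feature.newBuilder().setFloatList(vals).build()
        )
      }
    }

With this layout a reader would only store the non-zero entries, at the cost of having to reassemble the sparse tensor when parsing the tf.Example.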

I found this code in your DefaultTfRecordRowEncoder.scala; it explicitly converts a SparseVector to a DenseVector.

I have a 1000-dimensional feature vector in my DataFrame with only about 90 non-zero values per row. This conversion makes the TFRecord dataset much larger than the snappy-compressed Parquet files Spark writes.
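To put rough numbers on that: a dense float_list for such a row stores all 1000 values, while an indices/values pair would store about 90 + 90 entries. A minimal sketch using spark-mllib's Vectors (the non-zero layout below is made up to mirror the question):

    import org.apache.spark.ml.linalg.{SparseVector, Vectors}

    object SparseVsDense {
      def main(args: Array[String]): Unit = {
        // A 1000-dimensional vector with 90 non-zero entries, as in the question.
        val indices = (0 until 90).map(_ * 11).toArray   // arbitrary non-zero positions
        val values  = Array.fill(90)(1.0)
        val sparse  = Vectors.sparse(1000, indices, values).asInstanceOf[SparseVector]

        // What the encoder above writes: a dense float list of every dimension.
        val denseFloats = sparse.toDense.toArray.map(_.toFloat)
        println(s"dense float_list length: ${denseFloats.length}")  // 1000
        println(s"sparse indices + values: ${sparse.indices.length} + ${sparse.values.length}")  // 90 + 90
      }
    }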

I'm a little confused about why this conversion is done.