Open skattoor opened 6 years ago
You can add your own helper function that does that, like this:
import org.apache.spark.sql.DataFrame

implicit class RichDF(val ds: DataFrame) {
  def showHTML(limit: Int = 20, truncate: Int = 20) = {
    import xml.Utility.escape
    val data = ds.take(limit)
    val header = ds.schema.fieldNames.toSeq
    // Convert every cell to a display string, truncating long values.
    val rows: Seq[Seq[String]] = data.map { row =>
      row.toSeq.map { cell =>
        val str = cell match {
          case null => "null"
          case binary: Array[Byte] => binary.map("%02X".format(_)).mkString("[", " ", "]")
          case array: Array[_] => array.mkString("[", ", ", "]")
          case seq: Seq[_] => seq.mkString("[", ", ", "]")
          case _ => cell.toString
        }
        if (truncate > 0 && str.length > truncate) {
          // do not show ellipses when truncating to fewer than 4 characters.
          if (truncate < 4) str.substring(0, truncate)
          else str.substring(0, truncate - 3) + "..."
        } else {
          str
        }
      }: Seq[String]
    }
    // Escape header and cell text, then hand the table to the kernel for display.
    publish.html(s""" <table>
      <tr>
        ${header.map(h => s"<th>${escape(h)}</th>").mkString}
      </tr>
      ${rows.map { row =>
        s"<tr>${row.map{c => s"<td>${escape(c)}</td>" }.mkString}</tr>"
      }.mkString}
      </table>
    """)
  }
}
Result:
I cannot find publish.html(). Where does it come from? Which library?
@kretekpodnietek, it should be available in the jupyter-scala kernel: https://github.com/jupyter-scala/jupyter-scala#displaying-html--images--running-javascript
@Aivean: I never took the time to thank you for this. It works beautifully and I have made good use of it ever since you answered. Thank you very much! 😃
@Aivean Could you please share where publish.html() comes from? I am not able to find it.
@venkatnbcu, from what I can tell, the API has changed since I posted this snippet. Quick googling shows that publish is now a member of kernel: https://almond.sh/docs/api-jupyter.html#display-data
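Since the display call is the part that moves around between kernel versions, one way to make the snippet less fragile is to build the HTML string in a plain function and keep the kernel call to a single line. A minimal sketch of that idea follows; `HtmlTable`, `escapeHtml` and `renderTable` are hypothetical names made up for illustration (not part of any kernel API), and the hand-rolled escaping is only a stand-in for `xml.Utility.escape` so that the block has no dependencies.

```scala
// Hypothetical helper, not part of jupyter-scala or Almond.
object HtmlTable {
  // Minimal HTML escaping so that cell contents cannot break the markup.
  def escapeHtml(s: String): String = {
    val sb = new StringBuilder
    s.foreach {
      case '&' => sb.append("&amp;")
      case '<' => sb.append("&lt;")
      case '>' => sb.append("&gt;")
      case '"' => sb.append("&quot;")
      case c   => sb.append(c)
    }
    sb.toString
  }

  // Builds the table markup from cells that are already converted to strings.
  def renderTable(header: Seq[String], rows: Seq[Seq[String]]): String = {
    val head = header.map(h => s"<th>${escapeHtml(h)}</th>").mkString
    val body = rows.map { row =>
      row.map(c => s"<td>${escapeHtml(c)}</td>").mkString("<tr>", "", "</tr>")
    }.mkString
    s"<table><tr>$head</tr>$body</table>"
  }
}

// Only this last step depends on the kernel:
// val html = HtmlTable.renderTable(header, rows)
// publish.html(html)         // older jupyter-scala, as in the snippet above
// kernel.publish.html(html)  // assumption: Almond, going by the linked docs
```

With this split, the table building can be checked outside a notebook, and an API change in the kernel only touches the final line.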
Thank goodness this is possible!! 🙌
I know this was for the Almond kernel, but for anyone else using the Apache Toree kernel, I managed to adapt this and thought I'd share:
import org.apache.spark.sql._

implicit class RichDF(val df: DataFrame) {
  def view(limit: Int = 20, truncate: Int = 20) = {
    import xml.Utility.escape
    val data = df.take(limit)
    val header = df.schema.fieldNames.toSeq
    val rows: Seq[Seq[String]] = data.map { row =>
      row.toSeq.map { cell =>
        val str = cell match {
          case null => "null"
          case binary: Array[Byte] => binary.map("%02X".format(_)).mkString("[", " ", "]")
          case array: Array[_] => array.mkString("[", ", ", "]")
          case seq: Seq[_] => seq.mkString("[", ", ", "]")
          case _ => cell.toString
        }
        if (truncate > 0 && str.length > truncate) {
          // do not show ellipses for strings shorter than 4 characters.
          if (truncate < 4) str.substring(0, truncate)
          else str.substring(0, truncate - 3) + "..."
        } else {
          str
        }
      }: Seq[String]
    }
    kernel.display.html(s""" <table>
      <tr>
        ${header.map(h => s"<th>${escape(h)}</th>").mkString}
      </tr>
      ${rows.map { row =>
        s"<tr>${row.map{c => s"<td>${escape(c)}</td>" }.mkString}</tr>"
      }.mkString}
      </table>
    """)
  }
}
...
df.view()
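In case the effect of the `truncate` argument is not obvious from the helper, here is the same rule pulled out into a standalone function so it can be tried without Spark or a kernel. `CellFormat` and `truncateCell` are hypothetical names used only for this sketch; the logic is copied from the `if (truncate > 0 && ...)` branch of the helper.

```scala
// Hypothetical standalone copy of the truncation rule used in the helper.
object CellFormat {
  def truncateCell(str: String, truncate: Int): String =
    if (truncate > 0 && str.length > truncate) {
      // Below 4 characters there is no room for "...", so just cut the string.
      if (truncate < 4) str.substring(0, truncate)
      else str.substring(0, truncate - 3) + "..."
    } else {
      // truncate <= 0 disables truncation; short strings pass through unchanged.
      str
    }
}

// CellFormat.truncateCell("hello world", 8)  // "hello..."
// CellFormat.truncateCell("hello world", 3)  // "hel"
// CellFormat.truncateCell("hello world", 0)  // "hello world"
// CellFormat.truncateCell("short", 20)       // "short"
```

So a call like `df.view(limit = 50, truncate = 0)` should show up to 50 rows with full cell contents, while the defaults cap each cell at 20 characters including the ellipsis.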
That would be neat. I searched around but didn't find what I was looking for. Any help appreciated!