scala / scala-xml

The standard Scala XML library
Apache License 2.0

Unique identifier for the Node instances #403

Open geirolz opened 4 years ago

geirolz commented 4 years ago

Hi, I'm writing a Scala library that adds a functional layer on top of this library. In my library I've defined a rule for editing the XML document. This rule contains a zoom function that selects the target nodes and an action to apply to them.

From this rule instance I create a standard RewriteRule, which is applied to the whole document using a RuleTransformer.

The code is something like this:

val document : NodeSeq = ???
val rule : XmlRule = ???
val target : NodeSeq = rule.zoom(document)

val rwr = new RewriteRule {
  override def transform(ns: Seq[Node]): Seq[Node] = {
    if(ns == target)
      rule.action(target)
    else 
      ns 
  }
}

Problem: when the document has nodes with the same values, ns == target returns true multiple times, even if those nodes do not descend from the same parent.

Question: Is there a way to obtain a unique id for each node? If not, would it be a valid proposal to generate a unique id when the node is instantiated and keep it as a field of the node?

abstract class Node {
  private lazy val uuid : UUID = UUID.randomUUID() //or something else

  def sameInstance(that: Node) : Boolean = {
    this.equals(that) && this.uuid.equals(that.uuid)
  }
} 

val n1 : Node = <Node />
val n2 : Node = <Node />
n1.sameInstance(n2) //false

ashawley commented 4 years ago

Do you have a minimal example of the XML and of the call to RewriteRule that shows your issue? Some examples of using RewriteRule are at the bottom of the "Getting started" page of the wiki: https://github.com/scala/scala-xml/wiki/Getting-started

> The problem is: when the document has nodes with the same values, ns == target returns true multiple times, even if the node doesn't descend from the same parent.

I've never used it for XML, but Scala does have eq, which tests reference identity rather than value equality. It works for scala.xml.Node:

scala> n1 eq n2
res0: Boolean = false

scala> n1 eq n1
res1: Boolean = true

However, RewriteRule has a tendency to copy the XML tree while transformation happens, so eq may not work for you.
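
To make the difference between == and eq, and the copying caveat, concrete, here is a minimal sketch. It assumes scala-xml is on the classpath; the object name EqCaveatSketch, the <a><b/></a> document and the sawOriginalRoot flag are made up for illustration and do not come from the library.

```scala
import scala.xml.{Elem, Node}
import scala.xml.transform.{RewriteRule, RuleTransformer}

object EqCaveatSketch {
  val n1: Node = <Node/>
  val n2: Node = <Node/>

  // == compares nodes by value (label, attributes, children).
  val structurallyEqual: Boolean = n1 == n2
  // eq compares object identity only.
  val sameReference: Boolean = n1 eq n2

  // Hypothetical document: the rule below replaces <b/> with <c/>.
  val root: Elem = <a><b/></a>

  // Set to true only if the rule is ever handed the original root instance.
  var sawOriginalRoot = false

  val rule: RewriteRule = new RewriteRule {
    override def transform(n: Node): Seq[Node] = {
      if (n eq root) sawOriginalRoot = true
      n match {
        case e: Elem if e.label == "b" => <c/>
        case other                     => other
      }
    }
  }

  // Children are transformed first; an element whose children changed is
  // rebuilt as a new instance before the rule sees it.
  val result: Node = new RuleTransformer(rule).apply(root)

  def main(args: Array[String]): Unit = {
    println(structurallyEqual) // true
    println(sameReference)     // false
    println(sawOriginalRoot)   // false
    println(result)
  }
}
```

In this sketch the root is never seen as the original instance, because its child was rewritten first. An eq check is therefore only dependable for nodes whose own subtree is left untouched by the rules.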

geirolz commented 4 years ago

Here is an example:

import cats.implicits._

import scala.util.{Failure, Success, Try}
import scala.xml._
import scala.xml.transform.{RewriteRule, RuleTransformer}

val doc = <persons><person name="David" /></persons>

case class Rule(zoom: NodeSeq => NodeSeq, action: NodeSeq => Try[NodeSeq]) { $rule =>

  def toRewriteRule(wholeDocument: NodeSeq) : RewriteRule = new RewriteRule {

    val target = $rule.zoom(wholeDocument)
    val targetUpdated = $rule.action(target)

    println("aa: " + target)
    override def transform(ns: collection.Seq[Node]): collection.Seq[Node] =
      if (ns.sameElements(target))
        targetUpdated.get
      else
        ns
  }
}

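val rule : Rule = Rule(
  // Select only the <person name="David"/> elements that are direct children of the root.
  zoom = _ \ "person" filter(_ \@ "name" == "David"),
  // Append <ToAppend/> to every zoomed element; fail on any node that is not an element.
  action = ns => ns.map{
    case e: Elem => Success(e.copy(child = e.child ++ <ToAppend/>))
    case _ => Failure(new RuntimeException("Invalid type."))
  }.toList
    .sequence
    .map(ns => NodeSeq.fromSeq(ns))
)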

val rt: Seq[Node] = new RuleTransformer(rule.toRewriteRule(doc)).transform(doc)

This works, but consider this input document:

<persons>
  <person name="David" />
  <node1>
    <node2>
      <person name="David" /> <!-- SAME AS FIRST -->
    </node2>
  </node1>
</persons>

As a result I get:

<persons>
  <person name="David"><ToAppend/></person>
  <node1>
    <node2>
      <person name="David"><ToAppend/></person>
    </node2>
  </node1>
</persons>

This is the correct result for the implementation I've written, but I'm asking whether there is a way to solve my problem, i.e. to update only the node selected by the zoom.

ashawley commented 4 years ago

There's still a lot happening in your example, but thanks for trying to minimize it. I've tried to simplify it further to show how to write the target and how to match it in the RewriteRule. It seems like you can just use eq instead of == so that only the actual target instance matches. Hopefully it clarifies a few things.

import scala.xml.{Elem, Node}
import scala.xml.transform.{RewriteRule, RuleTransformer}

val doc =
  <persons>
    <person name="David" />
    <skip>
      <person name="David" />
    </skip>
  </persons>
val target: Node = (doc \ "person").filter(_ \@ "name" == "David").head
val update: Seq[Node] = target.map {
  case elem: Elem => elem.copy(child = (elem.child ++ <toAppend/>))
  case n => n
}
val rewriteRule = new RewriteRule {
  override def transform(n: Node): Seq[Node] = {
    if (target eq n)
      update
    else
      n
  }
}
val transform = new RuleTransformer(rewriteRule)
println(transform(doc))
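
The eq suggestion can also be folded back into the Rule abstraction from the earlier comment. The following is a rough sketch only, with some assumptions: the object name EqRuleSketch is made up, and Rule is a simplified, hypothetical variant whose action returns a NodeSeq directly instead of a Try[NodeSeq], so that cats is not needed.

```scala
import scala.xml.{Elem, Node, NodeSeq}
import scala.xml.transform.{RewriteRule, RuleTransformer}

object EqRuleSketch {
  // Hypothetical, simplified Rule: unlike the Rule in the thread,
  // action returns NodeSeq rather than Try[NodeSeq].
  final case class Rule(zoom: NodeSeq => NodeSeq, action: NodeSeq => NodeSeq) {
    def toRewriteRule(wholeDocument: NodeSeq): RewriteRule = {
      // Resolve the targets once, against the original document instances.
      val targets: Seq[Node] = zoom(wholeDocument)
      new RewriteRule {
        // Match by reference (eq), so equal-looking nodes elsewhere are skipped.
        override def transform(n: Node): Seq[Node] =
          if (targets.exists(_ eq n)) action(n) else n
      }
    }
  }

  val doc: Elem =
    <persons><person name="David"/><skip><person name="David"/></skip></persons>

  val rule: Rule = Rule(
    // Only <person name="David"/> elements directly under the root.
    zoom = d => (d \ "person").filter(p => (p \@ "name") == "David"),
    // Append <toAppend/> to each zoomed element, leave other nodes as they are.
    action = (ns: NodeSeq) => NodeSeq.fromSeq(ns.toList.map {
      case e: Elem => e.copy(child = e.child ++ <toAppend/>)
      case other   => other
    })
  )

  val transformer = new RuleTransformer(rule.toRewriteRule(doc))
  val result: Node = transformer(doc)

  def main(args: Array[String]): Unit =
    println(result)
}
```

With this sketch only the first person element receives the toAppend child, while the equal-looking one under skip is left alone. The same caveat about copying applies: if another rule rewrites something inside a target before this rule sees it, the target will have been rebuilt and the eq check will no longer match.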