znerol / node-delta

Delta.js - A JavaScript diff and patch engine for DOM trees
http://znerol.github.com/node-delta/
MIT License

Distribution for M$ #16

Open TWINGSISTER opened 2 years ago

TWINGSISTER commented 2 years ago

I am trying to build history support for GeoGebra (to provide some evidence whenever GGB is used in auto-graded activities in Moodle). GGB stores its internal state in a large XML document, and snapshotting it at every student's step, for all the students in all classes for the whole school year, will sink the Moodle MySQL DB. Only a few things change at each step, so storing DOM diffs would be a game changer. When some evidence or explanation is required for a bad grade, the teacher can show the history recreated by applying the deltas, and with it the student's process. Surprisingly, Stack Overflow does not contain any hints for diffing XML. Obviously I could write a bunch of recursive JS lines that do the job, but I do not like to reinvent the wheel. To fit into GGB JavaScript it must be a single-file JS. Working on M$ W10, I found it difficult to run the make-based build. Can you provide a distribution where all the make steps are done with GitHub Actions?

znerol commented 2 years ago

A preliminary caveat: this repository holds the artifacts produced during a student's thesis. The only purpose of this work is to demonstrate that the algorithm discussed in the paper isn't utterly flawed in any obvious way. I myself never used the code in production, so it is safe to assume that the project is unmaintained and will likely remain that way.

Your problem description seems to touch on multiple areas, which might be better solved independently.

The storage problem

In your position I'd probably try some simpler steps in order to reduce storage requirements and database traffic:

  1. XML is extremely redundant, so its textual representation is extremely compressible. I therefore recommend simply gzipping the data files before sending them to the database and decompressing them before delivering them to the client. You might even send them to the client in compressed form and just add an appropriate Content-Encoding HTTP header.
  2. As an alternative, avoid storing the files in the database and use a more appropriate backend instead (e.g., a Swift/S3-compatible object storage service).
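
As a rough illustration of the first option, here is a minimal Node.js sketch using the built-in `zlib` module. The sample XML only stands in for a real GGB state document, and the helper names (`makeSampleXml`, `compressSnapshot`, `decompressSnapshot`, `compressedResponseHeaders`) are hypothetical. The actual storage code would live wherever the Moodle side writes to the database.

```javascript
// Sketch only: the sample XML and all helper names are hypothetical.
const zlib = require('zlib');

// Build a deliberately repetitive XML snapshot, standing in for a GGB state document.
function makeSampleXml(elementCount) {
  const parts = ['<construction>'];
  for (let i = 0; i < elementCount; i++) {
    parts.push(
      `<element type="point" label="P${i}"><coords x="${i}" y="${i}" z="1"/></element>`
    );
  }
  parts.push('</construction>');
  return parts.join('\n');
}

// Compress before writing to the database (the result belongs in a binary column).
function compressSnapshot(xml) {
  return zlib.gzipSync(Buffer.from(xml, 'utf8'));
}

// Decompress after reading the stored bytes back.
function decompressSnapshot(buffer) {
  return zlib.gunzipSync(buffer).toString('utf8');
}

// Headers for sending the stored bytes to the client without decompressing them first.
function compressedResponseHeaders() {
  return { 'Content-Type': 'application/xml', 'Content-Encoding': 'gzip' };
}

const xml = makeSampleXml(500);
const compressed = compressSnapshot(xml);
console.log(
  `original: ${Buffer.byteLength(xml, 'utf8')} bytes, gzipped: ${compressed.length} bytes`
);
```

The ratio you get on real snapshots will differ from this synthetic input, so it is worth measuring on a few actual GGB files before deciding whether diffs are needed at all.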

The document history problem

Even if the storage problem is solved, it might still be desirable to visualize the evolution of a document. But that is best solved in the client software. The diffs alone will very likely not be useful to the average student or instructor.

I recently learned that a team around GitHub user @milos-cuculovic is working on XML diffing strategies to streamline the review process of scientific papers. Their aim is specifically to extract the semantics of a change in order to present modifications between revisions of a work in a way that is more useful to human beings. I guess that this approach might be more appropriate for solving the document history problem.