OpenUserJS / OpenUserJS.org

The home of FOSS user scripts.
https://openuserjs.org/
GNU General Public License v3.0

Store rendered Markdown/Markup #638

Open sizzlemctwizzle opened 9 years ago

sizzlemctwizzle commented 9 years ago
```javascript
var mongoose = require('mongoose');
var Schema = mongoose.Schema;

var renderedContentSchema = new Schema({
  content: String,
  model: String,
  _contentId: Schema.Types.ObjectId
});

var RenderedContent = mongoose.model('RenderedContent', renderedContentSchema);

exports.RenderedContent = RenderedContent;
```

This is where output from renderMd can be stored, so that we only render markdown when it changes.
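As a rough illustration of the idea, the sketch below renders once when content is saved and serves the stored copy on later reads. It is not the project's code: `renderMd` here is a hypothetical stand-in for the real renderer and sanitizer, a `Map` stands in for the `RenderedContent` collection, and `saveRendered`, `getRendered`, and `cacheKey` are names made up for this example.

```javascript
// Hypothetical stand-in for the project's renderMd (assumption: the real
// one runs a Markdown renderer followed by the sanitizer).
let renderCount = 0;
function renderMd(text) {
  renderCount += 1;
  return '<p>' + text + '</p>';
}

// In-memory stand-in for the RenderedContent collection,
// keyed by model name plus content id.
const renderedStore = new Map();

function cacheKey(model, contentId) {
  return model + ':' + contentId;
}

// Call whenever content is created or edited: render once and store it.
function saveRendered(model, contentId, markdown) {
  const content = renderMd(markdown);
  renderedStore.set(cacheKey(model, contentId), {
    content: content,
    model: model,
    _contentId: contentId
  });
  return content;
}

// Call on page views: serve the stored copy, rendering only on a miss.
function getRendered(model, contentId, markdown) {
  const hit = renderedStore.get(cacheKey(model, contentId));
  return hit ? hit.content : saveRendered(model, contentId, markdown);
}
```

With this shape, repeated views of an unchanged comment cost one lookup each, and the renderer only runs again when `saveRendered` is called after an edit.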

Martii commented 9 years ago

This might be considered a bad idea if a security hole turns up in our sanitizer and we have to redo the entire database. There is already precedent for this.

sizzlemctwizzle commented 9 years ago

I'll build in a routine to re-render all stored content. I'll even put a button for it on the admin dashboard.

On Jun 5, 2015 9:51 AM, "Marti Martz" notifications@github.com wrote:

This might be considered a bad idea if our sanitizer discovers a security hole and we have to redo the entire database. There is already precedence for this occasion.

— Reply to this email directly or view it on GitHub https://github.com/OpenUserJs/OpenUserJS.org/issues/638#issuecomment-109318002 .

Martii commented 9 years ago

Perhaps a flag check comparing the last-updated date against the last re-rendering date could work too, letting users trigger the re-render when they revisit the affected pages... but we don't currently store that information in the db.

sizzlemctwizzle commented 9 years ago

Then we'd still have a security hole as long as the content hasn't expired. With my process we could fix everything immediately. Plus less re-rendering.

Martii commented 9 years ago

Not quite... on access it would check lastupdate versus lastforced, and any visitor would trigger that check and a re-render if needed... thus the (potentially) infected content never gets displayed to anyone. Your methodology is less scalable and will take a very long time when the db is much larger... USO had hours of downtime because of this.
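A minimal sketch of that on-access check, under stated assumptions: `lastForced` is a single global timestamp set when a sanitizer fix lands, each cache entry records when it was rendered (`renderedAt`), and `renderMd`, `forceRerender`, and `getRendered` are hypothetical names for this example. The thread notes that none of this is stored in the db yet, so a `Map` stands in for storage.

```javascript
// Hypothetical stand-in for renderMd; counts calls so the effect is visible.
let renders = 0;
function renderMd(text) {
  renders += 1;
  return '<p>' + text + '</p>';
}

const cache = new Map(); // contentId -> { content, renderedAt }
let lastForced = 0;      // time of the last forced invalidation

// Run once after a sanitizer fix: nothing is re-rendered here, every
// older entry simply becomes stale.
function forceRerender(now) {
  lastForced = now;
}

// Run on every page view. `now` is a parameter only to keep the sketch
// deterministic; it defaults to the current time.
function getRendered(contentId, markdown, now) {
  if (now === undefined) now = Date.now();
  const entry = cache.get(contentId);
  // Null check (never rendered) plus date check (rendered before the
  // last forced invalidation). Ties count as stale, to be safe.
  if (entry && entry.renderedAt > lastForced) {
    return entry.content;
  }
  const content = renderMd(markdown);
  cache.set(contentId, { content: content, renderedAt: now });
  return content;
}
```

The cost of a fix is spread across later page views instead of one long pass over the whole database, and a stale entry is always replaced before it is served.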

Martii commented 9 years ago

Related to #81 and #601 as well... one big db redo is going to be traffic intensive.

sizzlemctwizzle commented 9 years ago

I see what you mean and I agree. This lets us re-render when we need to rather than all at once.

joeytwiddle commented 9 years ago

If this collection is just a cache, you could wipe the entire collection if a rendering issue is discovered. No need to store/compare dates.

I agree re-rendering and caching on demand sounds more scalable!
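The wipe-and-refill variant can be sketched like this, again with a `Map` standing in for the collection and with hypothetical names (`renderMd`, `getRendered`, `wipeRenderedCache`); how a real wipe would behave on the project's database is not covered here.

```javascript
// Hypothetical stand-in for renderMd; counts calls so the effect is visible.
let renders = 0;
function renderMd(text) {
  renders += 1;
  return '<p>' + text + '</p>';
}

// The cache holds nothing that cannot be rebuilt from the source markdown.
const cache = new Map(); // contentId -> rendered HTML

// Render on demand: only a missing entry triggers the renderer.
function getRendered(contentId, markdown) {
  if (!cache.has(contentId)) {
    cache.set(contentId, renderMd(markdown));
  }
  return cache.get(contentId);
}

// After a rendering or sanitizer fix: drop everything, no dates to compare.
function wipeRenderedCache() {
  cache.clear();
}
```

After `wipeRenderedCache()`, each page is rendered again the first time someone requests it, so no timestamps need to be stored or compared.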

sizzlemctwizzle commented 9 years ago

@joeytwiddle might be right. I don't think it takes long to destroy an entire collection. Once the collection is gone, all content will be forced to re-render on demand. Of course our inevitable need to use sharding may make this slower. I really don't know.

Martii commented 9 years ago

@sizzlemctwizzle That's a null check to see whether it has even been rendered, i.e. a date check... lastchecked vs. lastforced is a general analogy for comparing the previous time with the latest time, e.g. null vs. exists. "Wiping" will still take longer, especially considering that all comments/issues/discussions are rendered as well, not just script homepages... our DB isn't that large at the moment, but give it a few more years and performance could be adversely affected.

Martii commented 9 years ago

Another miscellaneous note as well... DB compaction... eventually wiping could potentially fragment the database even worse than an overwrite... we haven't touched on that yet since the change at https://github.com/OpenUserJs/OpenUserJS.org/commit/d64f754af5818301648354e32596ed922d683e95

Since OUJS is now on SSDs it might be prudent to check the warranty status with the host provider and how often they perform backups of our VPS, replacements, etc. and if they do any compression anywhere.