cgiffard / Downsize

Tag safe text truncation for HTML and XML!
BSD 3-Clause "New" or "Revised" License
41 stars 13 forks source link

Allow for 'contextual' tags which are not clipped. #11

Closed adam-zethraeus closed 10 years ago

adam-zethraeus commented 10 years ago

Christopher,

This is the commit i most want to land in the project.

The motivation is to modify Downsize to output nicer blog snippets in Ghost.

The idea is to allow the user to specify a list of tags whose content should not be cleaved. I believe this can be especially useful in allowing you to specify the core markdown top block elements ["p", "ul", "ol", "pre", "blockquote"] that have intra-article-level semantic meaning and NOT specify tags used for layout that don't covey any context (div etc.) or tags that convey context at a high level (article, header etc).

The implementation is a simple set intersection of the open tag stack and the passed list of 'contextual tags'.

I'd love to gauge your interest in this feature! If you're not opposed, please let me know if there's anything you'd like improved in the implementation!

The test won't pass without https://github.com/cgiffard/Downsize/pull/9. (If you do want to use all the commits I've thrown at you, you can also just merge my master branch and save the conflict annoyance.)

-adam

cgiffard commented 10 years ago

Hi! I'm just busy at work right now, but I really appreciate you taking the time to file these PRs, and rest assured I've noticed them. :)

I'll get to reviewing and merging this evening!

adam-zethraeus commented 10 years ago

Thanks for taking a look! :)

cgiffard commented 10 years ago

I'm not ignoring you — I've been really busy over the last couple of days. I'll try and get to this on the weekend. Really sorry for the delay!

adam-zethraeus commented 10 years ago

I appreciate the update :). No worries!

cgiffard commented 10 years ago

Hey! I went to merge this — but it looks like it isn't passing your test suite additions:

> downsize@0.0.5 test /Users/cgiffard/Development/Projects/text-downsize
> mocha -R spec test.js

  Word-wise truncation
    ✓ should be able to truncate across nested tags 
    ✓ should be able to truncate even if a single quote is found inside a string of double quotes or vice-versa 
    ✓ should be able to naively balance HTML markup 
    ✓ should be able to naively balance HTML markup 
    ✓ should ignore erroneously unescaped carets 
    ✓ should ignore comments in markup, and carets in comments 
    ✓ should understand implicitly void tags, and not attempt to close them 
    ✓ should understand self-closing tags, and not attempt to close them 
    ✓ should understand self-closing tags, even marked up poorly. 
    ✓ should close unknown tags appropriately 
    ✓ should permit unescaped carets inside double-quoted strings 
    ✓ should permit unescaped carets inside single-quoted strings 
    ✓ should properly recognised manually closed elements, and do not re-close elements 
    ✓ should properly handle unicode languages 
    ✓ should properly handle unicode languages across nested tags 
    ✓ should properly properly character-truncate across tag boundries 
    1) should await the end of the containing paragraph
    2) should await the end of the containing unordered list

  Appending
    ✓ should properly append an ellipsis where required 
    ✓ should not append an ellipsis where not required 

  Performance
    truncate five words from a four-million word corpus one hundred thousand times
      ✓ benchmark time should be under twenty seconds (9721ms)

  ✖ 2 of 21 tests failed:

  1) Word-wise truncation should await the end of the containing paragraph:

      actual expected

      "<p>there are more than seven words in this paragraph</p><p>this is unrelated</p>"

  2) Word-wise truncation should await the end of the containing unordered list:

      actual expected

      "<ul><li>item one</li><li>item two</li><li>item three</li></ul><p>paragraph</p>"

Any thoughts? Am I doing it wrong?

adam-zethraeus commented 10 years ago

I can't tell for sure from the paste of the 'actual expected's, but could it be that you merged it without merging the other one i mentioned in the PR (https://github.com/cgiffard/Downsize/pull/9) first?

Sorry for having them as separate individual commits, I'm not really sure on the GitHub etiquette for mergability vs individual issues per PR!

cgiffard commented 10 years ago

Ah, I didn't notice they weren't individually mergeable. My apologies.

cgiffard commented 10 years ago

Oh, man, you documented this everywhere too. My brain just isn't working properly today! Sorry about that.

cgiffard commented 10 years ago

I've merged it manually. Thanks for all your help!

cgiffard commented 10 years ago

I'm going to document this for the readme — might be back in a little bit to make sure you're happy with it. :)

adam-zethraeus commented 10 years ago

Awesome, thanks! Let me know if I can do anything else. Can you ping me when you npm publish? I have a ghost API commit to expose this mostly ready. :)

cgiffard commented 10 years ago

Published. 0.0.6. :)