bhollis / maruku

A pure-Ruby Markdown-superset interpreter (Official Repo).
MIT License
502 stars 80 forks source link

Undesired use of XML <short /> tags #24

Closed sorbits closed 12 years ago

sorbits commented 14 years ago

If I have an input document like:

<foo></foo>

Maruku will output that as:

<foo />
andyfowler commented 14 years ago

This is particular troublesome for <iframe> tags which Maruku renders as self-closing. Then they swallow the rest of the page.

distler commented 13 years ago

Good point. Fixed in my Nokogiri branch.

stepheneb commented 13 years ago

For some iframes (like a youtube video embed) you can workaround this by putting some placeholder text in the iframe element:

<iframe width="560" height="345" src="http://www.youtube.com/embed" frameborder="0">placeholder</iframe>

I've got a pull request into the master branch that adds a failing iframe spec test: https://github.com/nex3/maruku/pull/39

Jacques,

I took a look at your nokogiri branch to see if it was worth merging into that branch but I'm getting 67 failures running the block tests in Ruby 1.9.2:

$ rspec spec/block_spec.rb
...
576 examples, 67 failures

Is that similar to what you are getting?

I looked but didn't see the change you made on the nokogiri branch that prevents iframe elements from self-closing.

distler commented 13 years ago

There are a fairly large number of failures to do with the to_md and to_s methods. I don't think they have anything to do with my Nokogiri branch, per-se. The same tests are borked in the main branch.

(I have to say that I was rather annoyed to see lots of tests which were artificially made to "pass", by accepting plainly-bogus output as "correct." I "fixed" many of those, preferring to have a failing test than to have incorrect behaviour marked as correct.)

I looked but didn't see the change you made on the nokogiri branch that prevents iframe elements from self-closing.

Using Nokogiri's XHTML output fixes that issue. Specifically, look at this commit.

stepheneb commented 13 years ago

When I run the block tests on the iframe branch (one commit from master) in ruby 1.8.7 I get 4 failures (2 of which are from my iframe test) and 36 pending:

$ spec spec/block_spec.rb
...
570 examples, 4 failures, 36 pending

The two remaining look like encoding issues:

3)
'A Maruku document for the ./spec/block_docs/lists10.md file should have the expected #to_md output' FAILED
expected: "List:\n\n- è`gcc`",
     got: "List:\n\n-\250" (using ==)

4)
'A Maruku document for the ./spec/block_docs/lists11.md file should have the expected #to_md output' FAILED
expected: "- ένα",
     got: "-\255να" (using ==)

When I run the same tests in Ruby 1.9.2 with rspec 1.3.2 I get 30 additional failures:

$ spec spec/block_spec.rb
...
570 examples, 34 failures, 36 pending

I had to push another commit into the iframe branch to fix the require of spec_helper for Ruby 1.9.2.

Some of the to_html errors here are just because the attribute values are in a different order from what is expected:

1)
'A Maruku document for the spec/block_docs/alt.md file should have the expected #to_html output' FAILED
expected: "<p><img src='/foo.jpg' alt='bar' /></p>",
     got: "<p><img alt='bar' src='/foo.jpg' /></p>" (using ==)

In general it's not reasonable to count on attribute order preservation with xml parsers -- but it seems the rexml in Ruby 1.8.7 preserved attribute position.

Does Nokogiri preserve attribute order or did you implement a more reliable comparison test?

stepheneb commented 13 years ago

Jacques,

I don't see a place to send issues/features for the Ruby Gem itexttomml. I looked in these two locations:

But I wanted you to know I created a pull request for homebrew to add a recipe for installing the C library itex2MML:

https://github.com/mxcl/homebrew/pull/7441

I used the recipe to install itex2MML for use by the itexttomml gem -- so I could run the Maruku tests.

distler commented 13 years ago

As I said, a huge number of your so-called "passing" specs are fakes: they only pass because someone decided to silence the failing test by inserting bogus "expected" output.

For the record, I see

576 examples, 49 failures, 36 pending

Those 49 failures break down as follows:

38 to_md 6 to_latex 3 inspect 2 to_html

The to_md method is hopelessly broken (or, never completed, depending on your tastes). So I don't see those failures as particularly interesting. (I don't intend to fix them, and neither, apparently, did the fellow who made them "pass" on the main branch.)

Of the remaining failures, the 2 to_html (and the corresponding inspect failures) have to do with the fact that I have itex2mml ouput turned on by default. The output is the correct MathML, and I just haven't bothered to deal with the test.

That leaves the to_latex tests, which all have to do with escaping of special characters (which looks OK to me, but differs from the "expected" output).

Now, if 49 failed tests bothers you, I could very easily make them all pass (as the fellow who "fixed" the specs that you are running off the main branch), but I don't think that would improve the program's performance.

Does Nokogiri preserve attribute order or did you implement a more reliable comparison test?

No, it doesn't. But it does, reliably, output the attributes in the same order, each time you run the tests.

distler commented 13 years ago

I used the recipe to install itex2MML for use by the itexttomml gem -- so I could run the Maruku tests

gem install itextomml

should compile and install the requisite shared library, all by itself.

That didn't work for you?

I don't see a place to send issues/features for the Ruby Gem itexttomml.

I don't have a bug-tracker, but I do have a discussion forum.

stepheneb commented 13 years ago

gem install itextomml should compile and install the requisite shared library, all by itself.

I feel silly -- I didn't even try that!

The maruku gemspec is incomplete because it doesn't list any development dependencies.

I would be nice to write a complete gemspec and use bundler to manage the gem development dependencies.

After trying to run the specs I realized I needed rspec < 2.0 so I created an RVM gemset for maruku and installed rspec -v 1.3.2

Then running the specs I was getting failures telling me to "install itex2mml".

So I found the itex2MML site -- saw it was a C library and at first thought maruku spec test might use the library directly.

Tried to install it with homebrew but it didn't exist so I made a recipe.

Then I noticed the itextomml gem mentioned here itex2MML and went to look for it's source code so I could see how it used the itex2MML library -- couldn't find the source code.

So I installed itextomml and it worked -- which was where I made the incorrect assumption that having the C library previously installed was necessary.

stepheneb commented 13 years ago

Jacques,

I just sent you a pull request on your nokogiri branch that integrates bundler, Gemfile and a maruku.gemspec -- this will make it easier for other developers to help with development:

https://github.com/distler/maruku/pull/1

So ... after cloning maruku:

$ bundle install

$ bundle exec rake -T
rake build    # Build maruku-0.6.0.gem into the pkg directory
rake install  # Build and install maruku-0.6.0.gem into system gems
rake release  # Create tag v0.6.0 and build and push maruku-0.6.0.gem to Rubygems
rake spec     # Run RSpec
rake yard     # Generate YARD Documentation
distler commented 12 years ago

For what it's worth, all the to_html and to_latex tests now pass, on my nokogiri branch. And I mean genuinely pass, not just accept bogus output as 'correct'.

to_md is hopelessly broken, so I disabled those tests.

vsapronov commented 12 years ago

Hi, Just wanted to say that this converting tags to self-closing might be a bigger problem and in some cases can't be workarounded with space between open and close tags. For example if you'll put space between obviously it will change the value of the text field. Is there any chance to get the fix soon, especially in jekyll (since GitHub hosts static sites powered by jekyll)?

distler commented 12 years ago

Is there any chance to get the fix soon...?

Depending on how you look at it, this is either already fixed (on my Nokogiri branch -- see comment-2), or will never be fixed (because, with REXML, there's no HTML serializer).

bhollis commented 12 years ago

@distler Hi Jacques, you're a hard guy to get ahold of :-).

I've taken over maintainership of Maruku, but it looks like you've done a ton on the project in your own fork. I'll be looking through your commits (and the other forks) and pulling stuff in, but if there's stuff you'd like to have merged, I'll be paying special attention to pull requests. My hope is to get things updated and release a new version of Maruku to rubygems sometime in the next few weeks.

distler commented 12 years ago

I would have thought that I was easy to find.

Anyway, the switch from REXML to Nokogiri is kind of an all-or-nothing choice, since the APIs of these two libraries are rather different.

I suppose there are improvements that could be back-ported to the main branch, but I'm not inclined to spend the time figuring out what those are. Nokogiri is way faster, and resolves bugs like this one.

One other piece of advice: many of the spec tests on the main branch are worthless. Bogus, incorrect output has been marked as 'correct', in order to artificially make the tests pass. One of your first tasks should be going through those tests, one-by-one, to see whether the "expected" output is actually what you expect.

bhollis commented 12 years ago

Yeah, I'm excited about the nokogiri switch. Thanks for the advice. Really, I just wanted to let you know that pull requests will actually be looked at and merged now.

bhollis commented 12 years ago

Closing since I've merged in @distler's fixes.