tristanjuricek / knockoff

A Markdown parser + object model in scala
http://tristanjuricek.github.com/knockoff
BSD 3-Clause "New" or "Revised" License

Handling \r\n newlines #26

Closed: notnoop closed this 14 years ago

notnoop commented 14 years ago

Knockoff doesn't handle line endings well: it expects only the \n delimiter and fails if it sees a \r\n line ending (or a bare \r).

Here is a script that reveals the problem:

scala> import com.tristanhunt.knockoff.DefaultDiscounter._
import com.tristanhunt.knockoff.DefaultDiscounter._

scala> knockoff("\n") // normal case
res21: Seq[com.tristanhunt.knockoff.Block] = ListBuffer()

scala> knockoff("\r\n") // abnormal case
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
[...]

' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
java.lang.StackOverflowError
    at java.util.regex.Pattern.atom(Pattern.java:1952)
    at java.util.regex.Pattern.sequence(Pattern.java:1834)
    at java.util.regex.Pattern.expr(Pattern.java:1752)
    at java.util.regex.Pattern.group0(Pattern.java:2530)
    at java.util.regex.Pattern.sequence(Pattern.java:1806)
    at java.util.regex.Pattern.expr(Pattern.java:1752)
    at java.util.regex.Pattern.compile(Pattern.java:1460)
    at java.util.regex.Pattern.<init>(Pattern.java:1133)
    at java.util.regex.Pattern.compile(Pattern.java:823)
    at scala.util.matching.Regex.<init>(Regex.scala:41)
    at scala.collection.immutable.StringLike$class.r(StringLike.scala:202)
    at scala.collection.immutable.StringOps.r(StringOps.scala:31)
    at com.tristanhunt.knockoff.ChunkParser.bulletLead(MarkdownParsing.scala:69)
    at com.tristanhu...
tristanjuricek commented 14 years ago

Ach, I was hoping the various platform definitions were abstracting away newlines. I'll check this out over the weekend. Thanks for letting me know.

notnoop commented 14 years ago

Unfortunately, that didn't quite work as expected. I used knockoff as part of a Lift application. When text was entered through Firefox and/or Chrome (I don't remember which now) on Mac OS X, it arrived with \r\n newlines, which caused the problem to show up.
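
Until the parser copes with \r\n itself, a caller can normalize the text before handing it over. Browsers generally submit textarea content with \r\n line breaks whatever the platform, which would explain seeing them on a Mac. A minimal sketch, assuming a hypothetical helper named normalizeNewlines (it is not part of knockoff):

```scala
object NewlineNormalizer {
  // Hypothetical helper, not part of knockoff.
  // Rewrites Windows (\r\n) and bare carriage-return (\r) line endings to \n.
  // The \r\n pair is replaced first so it does not turn into two newlines.
  def normalizeNewlines(text: String): String =
    text.replace("\r\n", "\n").replace('\r', '\n')
}
```

With the DefaultDiscounter import from the script above, the call would then look like knockoff(NewlineNormalizer.normalizeNewlines(input)).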

tristanjuricek commented 14 years ago

I think the major issues should be fixed with version 0.7.3-15.

The fix mostly involved the handling of empty lines. Interestingly, saving some of the files locally in "DOS mode" worked fine, so I'm not really confident that I caught every possible edge case.
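
One way to probe for remaining edge cases is to run the parser over inputs that differ only in their line endings. The following is a rough sketch rather than knockoff's own test suite: LineEndingCheck, samples and failures are hypothetical names, and the sample list is illustrative, not exhaustive.

```scala
import scala.util.control.NonFatal

object LineEndingCheck {
  // Illustrative inputs: the same kinds of content with \n, \r\n, bare \r
  // and mixed line endings, including empty lines between paragraphs.
  val samples: Seq[String] = Seq(
    "\n",
    "\r\n",
    "\r",
    "para one\r\n\r\npara two\r\n",
    "* item\r\n* item\r\n",
    "mixed\r\nendings\nhere\r"
  )

  // Runs `parse` on every sample and returns the samples on which it threw.
  // StackOverflowError is caught explicitly because NonFatal does not match it.
  def failures(parse: String => Any): Seq[String] =
    samples.filter { s =>
      try { parse(s); false }
      catch {
        case _: StackOverflowError => true
        case NonFatal(_)           => true
      }
    }
}
```

Passing the parser in as a function keeps the check independent of knockoff itself; with the DefaultDiscounter import in scope it could be called as LineEndingCheck.failures(s => knockoff(s)), and an empty result means none of the samples threw.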