seth-brown / formd

A Markdown formatting tool
MIT License

Markdown-links starting at start of line truncate following text #9

Closed by derdennis 11 years ago

derdennis commented 11 years ago

After my comment to Issue #4, I updated to the latest version of formd, commit be5d17f237dddd1970c6d94eaf175fa7c5378167, and this fixed all my footnote-related problems. Thank you again for your quick fix.

Now I have noticed a problem with the link conversion in a rather lengthy post of mine. Whenever a link is at the start of a line, the body text that follows gets truncated. Putting any character in front of the link restores the full functionality of formd.

Let me demonstrate with this example file formd_test_file.markdown:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[Link to google][1]

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[Link to formd][2]

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[1]: http://google.com
[2]: http://drbunsen.github.com/formd/

Output of cat formd_test_file.markdown | formd -r:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[1]: [1]
[2]: http://drbunsen.github.com/formd/ 

Output of cat formd_test_file.markdown | formd -i:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

All text after the first link gets truncated. With formd -r, the reference at the bottom of the file is also turned into [1]: [1]. This is how I first noticed the problem, because I write my Markdown links inline and occasionally convert them to reference style.

Putting a space before the first link pushes the problem down to the next link in the file and still breaks the first link:

cat formd_test_file.markdown | formd -r:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

 [Link to google][1]

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[1]: [1]
[2]: http://drbunsen.github.com/formd/

In the above output the second link and the complete third paragraph are missing. The reference of the second link is still there, though...

Maybe this is somehow related to Issue #6?

I checked it against the formd version from the earlier commit 4b8ab19b50025fa1e20f5f20919841a1d56dc250, and this worked fine. But it lacks, of course, the footnote support...

I diffed the two versions and saw that the differences are limited to the self_match_links and self_match_refs parts of formd. Yeah, makes sense.

However, it seems that my Python-Fu is not strong enough to come up with a regex that handles both footnotes and links starting at the beginning of a line.

Would you be so kind as to have another look at this fine piece of software? Thank you very much for your time and effort.

Dennis
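The distinction the regex has to draw can be sketched roughly as follows. This is a minimal illustration, not formd's actual code: the pattern names REF_DEF, FOOTNOTE_DEF and REF_LINK and the sample text are hypothetical. The idea is that a reference definition always has a colon directly after the closing bracket, while a link at the start of a line does not, and a footnote is marked by a caret inside the brackets.

```python
import re

# Hypothetical patterns for illustration only; these are NOT formd's regexes.

# Reference definition: "[id]: target" at the start of a line.
# The colon right after "]" is what separates it from a link.
REF_DEF = re.compile(r"^\[([^\]^][^\]]*)\]:[ \t]*(\S+)", re.MULTILINE)

# Footnote definition: "[^id]: text" at the start of a line.
FOOTNOTE_DEF = re.compile(r"^\[\^([^\]]+)\]:[ \t]*(.+)$", re.MULTILINE)

# Reference-style link: "[text][id]", allowed anywhere, including column 0.
REF_LINK = re.compile(r"\[([^\]^][^\]]*)\]\[([^\]]+)\]")

sample = (
    "[Link to google][1]\n"
    "\n"
    "Body text with a footnote.[^note]\n"
    "\n"
    "[1]: http://google.com\n"
    "[^note]: A footnote.\n"
)

refs = REF_DEF.findall(sample)
footnotes = FOOTNOTE_DEF.findall(sample)
links = REF_LINK.findall(sample)

print(refs)
print(footnotes)
print(links)
```

With these patterns the link in column 0 is not picked up as a reference definition, because the character after its first closing bracket is "[" rather than ":".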

seth-brown commented 11 years ago

Dennis, thanks again for the detailed writeup of the bug. Can you test all my code?

As you rightly pointed out, the issue was with the regex. Specifically, the bug was with links that span multiple lines.

[Here's a link
crossing a line](www.example.com)

The bug should be fixed now. Let me know if you have any more issues.
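For illustration, one way a link crossing a line break can be matched without the DOTALL flag is to use negated character classes, which match newlines as well. The sketch below is an assumption about a possible approach, not the change made in formd; INLINE_LINK and the surrounding sample text are hypothetical.

```python
import re

# Hypothetical inline-link pattern, not formd's own.
# [^\]] matches any character except "]", newlines included,
# so the link text may cross a line break without re.DOTALL.
INLINE_LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")

text = "Intro.\n[Here's a link\ncrossing a line](www.example.com)\nOutro.\n"

match = INLINE_LINK.search(text)
label, url = match.groups()

print(repr(label))
print(url)
```

Because the match stops at the closing parenthesis, the text after the link is left untouched.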

derdennis commented 11 years ago

Mhm, I'm afraid it does not seem to work.

Look, I updated to the latest commit b9faf65911e48c4c68e53bdcb9842f5cc66d5e3e and used the same example file, formd_test_file.markdown with two links, each starting at the beginning of their line:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[Link to google][1]

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[Link to formd][2]

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[1]: http://google.com
[2]: http://drbunsen.github.com/formd/

Output of cat formd_test_file.markdown | formd -r:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

[1]: [1]
[2]: http://drbunsen.github.com/formd/ 

Output of cat formd_test_file.markdown | formd -i:

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum
dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor
invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero
eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no
sea takimata sanctus est Lorem ipsum dolor sit amet.

Exactly the same behaviour as before... Putting a space in front of the links restores the expected results.

I looked at your changes, and it seems you didn't touch the regex itself. Only the DOTALL option was removed:

- re.DOTALL | re.MULTILINE)
+ re.MULTILINE)

as well as some trailing whitespace at the end of some lines. Maybe you just forgot to alter the regex as well? The links in my test file are not spread over multiple lines, after all...
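To make clear what that flag change affects, here is a small sketch of how re.DOTALL alters matching in Python. The greedy pattern is made up for the demonstration and is not taken from formd, so it only shows what the flag controls, not the cause of the truncation reported here.

```python
import re

text = "[Link to google][1]\n\nSecond paragraph.\n\n[1]: http://google.com\n"

# Hypothetical greedy pattern anchored at the start of a line.
pattern = r"^\[.*\]"

# With DOTALL, "." also matches newlines, so ".*" runs to the last "]"
# in the whole text and swallows the paragraph in between.
with_dotall = re.search(pattern, text, re.DOTALL | re.MULTILINE)

# Without DOTALL, "." stops at the end of the line.
without_dotall = re.search(pattern, text, re.MULTILINE)

print(repr(with_dotall.group(0)))
print(repr(without_dotall.group(0)))
```

re.MULTILINE only changes where "^" and "$" may match; whether "." crosses line breaks is governed by re.DOTALL alone.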

Regarding your other code: Just send it my way...

Thank you once again for your time.

seth-brown commented 11 years ago

Thanks for your patience, Dennis. I've updated formd, and I can now confirm that it works correctly with the example Markdown you provided. Thanks again for the help, and let me know if you have any more difficulties.

derdennis commented 11 years ago

No worries, thank you for the quick fix.

Now it works like a charm with the provided example and with real-life Markdown including footnotes.

I'm a happy camper.