flying-sheep closed this 13 years ago
All whitespace is normalized to spaces prior to parsing a document, based on the value assigned to tab_length (default is 4). The tabs are not retained. If you want the tabs to be represented differently, you will need to assign the appropriate value to tab_length. If you would like your tabs restored, I'd suggest running your own postprocessor which replaces the appropriate number of spaces with a tab.
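As a rough illustration of the normalization described above, here is a minimal sketch that assumes the behavior matches Python's built-in str.expandtabs. The helper name normalize_whitespace is hypothetical and is not the library's API.

```python
def normalize_whitespace(text: str, tab_length: int = 4) -> str:
    """Expand each tab to the next multiple of tab_length columns, line by line.

    Hypothetical helper for illustration only; assumes expandtabs semantics.
    """
    return "\n".join(line.expandtabs(tab_length) for line in text.split("\n"))


sample = "\t\tcode"
default = normalize_whitespace(sample)   # two tabs become 8 spaces
wide = normalize_whitespace(sample, 8)   # two tabs become 16 spaces
```

Once the text has passed through a step like this, no tab characters remain, so anything downstream only ever sees spaces.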
I should note that if you are using the current code in the repo, tab_length is a keyword argument on the Markdown class and the wrapper function (do markdown.markdown(some_text, tab_length=8)). However, in previous versions of markdown it was a global variable, markdown.TAB_LENGTH, which you would have to override.
I don’t quite understand why they are converted at all. To normalize it? Then tabs would be more sensible (1 tab = 1 level of indentation, ¾ fewer bytes used).
Either way, it should support tab_length=None to use tabs instead of spaces. In the future, we can use CSS3’s tab-size, too.
First, this is a Python implementation of the original Perl implementation by John Gruber. It is noteworthy that John's implementation replaces tabs with spaces as well. We are copying that behavior and are not likely to change it unless he does (very unlikely).
That said, here are my responses to your specific comments:
If you could guarantee that every document author consistently used either spaces or tabs, there would be no need for normalizing whitespace. However, that would be an unrealistic expectation, especially for documents edited by multiple people (wiki pages?). By normalizing whitespace to use all spaces, we eliminate a lot of potential edge-case bugs (in those inconsistent documents).
True, we could normalize to tabs, but there are actually a number of peculiarities in the Markdown syntax which make spaces easier to work with (we often find tab_length - 1 in the code, for example). Which brings up another problem: if you set tab_length=None, and the parser finds a string of spaces, how many tabs are represented there?
tab_length = None should only mean “Tabs aren’t to be normalized”, but now I understand that the code doesn’t separate syntax from output. Instead of normalizing all the tabs to x spaces, then interpreting x-space-indented blocks as code blocks, we should adhere to the specification and interpret blocks which are indented by either x spaces or 1 tab as code blocks. Individual blocks indented with both spaces and tabs are an abomination ;)
We could do this easily by converting the first tab of each line into x spaces (while retaining the following ones) and then using the current code-block-finding code. This would even work for the aforementioned abominations. AFAIK there aren’t nested code blocks in Markdown.
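A minimal sketch of that proposal, assuming "first tab" means a tab at the very start of a line. The function names expand_leading_tab and prepare_lines are hypothetical and do not exist in the library.

```python
def expand_leading_tab(line: str, tab_length: int = 4) -> str:
    """Replace only a tab at column 0 with tab_length spaces.

    Any later tabs on the line are kept untouched (assumption: the
    proposal only cares about the first indentation level).
    """
    if line.startswith("\t"):
        return " " * tab_length + line[1:]
    return line


def prepare_lines(text: str, tab_length: int = 4) -> list[str]:
    """Apply expand_leading_tab to every line of the source text."""
    return [expand_leading_tab(line, tab_length) for line in text.split("\n")]


lines = prepare_lines("\tall:\n\t\techo hi\nplain")
```

Here the second line keeps its inner tab, so a code-block detector that looks for tab_length leading spaces would still match while the code's own indentation survives.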
PS: I only found tab_length being used in blockprocessors.py; where else is it used?
So, what’s going on?
I really want my code to be retained as it is and not changed. E.g. when coding in Genie, converting tabs to spaces introduces syntax errors (in this language you have to specify how many spaces one indentation level should have; without this specification, it’s 1 tab per indentation level).
Everything inside a code block shouldn’t be touched.
If you look at the diff of that pull request ignoring whitespace (?w=1), you can see that the tests were not altered in any way other than whitespace, and still run flawlessly:
I am trying to put code from a Makefile into a Markdown preformatted block and the normalization makes the resulting code invalid, since tabs are semantic in Makefiles.
It is impossible to fix this in a post-processing step because I also often include code snippets - in Make and other languages - that have eight or more spaces in a row.
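To make the objection concrete, here is a small sketch of a naive postprocessor. It assumes tabs were expanded with str.expandtabs, and restore_tabs is a hypothetical name used only for illustration.

```python
def restore_tabs(text: str, tab_length: int = 4) -> str:
    """Naive postprocessor: turn every run of tab_length spaces back into a tab."""
    return text.replace(" " * tab_length, "\t")


# A Makefile recipe line, where the leading tab is required.
makefile_rule = "\tgcc -o app main.c"
# A line that never contained tabs, only eight literal spaces.
aligned = "x = 1" + " " * 8 + "# eight literal spaces"

roundtrip_rule = restore_tabs(makefile_rule.expandtabs(4))
roundtrip_aligned = restore_tabs(aligned.expandtabs(4))
```

The recipe line happens to survive the round trip, but the second line comes back with tabs it never had, because after normalization nothing distinguishes original spaces from expanded tabs.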
In fact the Markdown "standard" says:
Regular Markdown syntax is not processed within code blocks.
I'd count whitespace folding/normalization as "regular Markdown syntax". It shouldn't be happening: code blocks should be deindented and HTML-entity-escaped, and nothing else.
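The behavior asked for here could look roughly like the following sketch, which strips one indentation level (one tab or tab_length spaces), escapes HTML, and leaves every other character alone. render_code_block is a hypothetical helper, not how the library currently works.

```python
import html


def render_code_block(lines: list[str], tab_length: int = 4) -> str:
    """Deindent one level and HTML-escape; do not touch remaining whitespace."""
    dedented = []
    for line in lines:
        if line.startswith("\t"):
            line = line[1:]                # one tab counts as one level
        elif line.startswith(" " * tab_length):
            line = line[tab_length:]       # so do tab_length spaces
        dedented.append(line)
    # quote=False escapes only &, < and >, which is enough inside <pre><code>.
    body = html.escape("\n".join(dedented), quote=False)
    return "<pre><code>" + body + "\n</code></pre>"


block = render_code_block(["\tall: main.c", "\t\tgcc -o app main.c && ./app"])
```

With this approach the recipe line inside the block still begins with a real tab, so the output stays valid when pasted back into a Makefile.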
This is simply wrong. It generates invalid output for Makefiles. Tabs should only be expanded for Markdown syntax, not inside code blocks.
the code →→code (where → is a tab) is expanded into this:
why aren’t my tabs retained?