typora / typora-issues

Bugs, suggestions or free discussions about the minimal markdown editor — Typora
https://typora.io

Non-breaking space `NBSP` added after typing inline markup if ENTER is not also typed. #5211

Open vassudanagunta opened 2 years ago

vassudanagunta commented 2 years ago

If you manually type any inline markup, e.g. emphasis (*), strong (**), strikeout (~~), highlight (==), code (`) or links ([]()), followed by a space (as is normal), and you don't press ENTER at the end of the line, the space following the closing delimiter of the markup will be a non-breaking space (NBSP, U+00A0) rather than a regular space. I am only aware of this because my IDE shows special characters such as NBSP.

This happens consistently and so is easy to reproduce:

Type *this line* but **do not** press `ENTER` at the end and ==save== it.

If you use a tool that shows special invisible characters, you will see four NBSP characters, one after each inline span. If you then press ENTER on that line, whether immediately after typing it or when you return to it later, all the NBSP characters are replaced by regular spaces.

But very often you need to edit an existing line, inserting new text that includes an inline markup. When you make such edits, you don't type ENTER, so this bug happens frequently.

This only happens while typing in hybrid mode. There is no problem in source mode.

This is a problem for two reasons:

  1. It's wrong for NBSP to be entered when only a regular space was typed.
  2. Later, if you edit the line (which is often a long paragraph), the NBSP chars are removed, resulting in spurious diff changes and noisy commit history in Git.

OS: macOS 12.3.1 Typora Version: 1.2.4

elliottslaughter commented 2 years ago

I'm seeing the same issue; same macOS and Typora version.

However, I don't think it's just related to inline code spans. I can't find any easy reproducers right now, but I've been seeing this for weeks at least, and most of my documents do not contain inline code. I'll type a medium to long document, open it in Emacs, and Emacs underlines the NBSP characters. Alternatively, you can run the Markdown through iconv. I just did this with your code sample:

$ iconv -f ASCII nbsp_test.md
...
Type `this line` but do not press `ENTER`
iconv: nbsp_test.md:12:41: cannot convert

(This was at the end of a longer document where I was trying to reproduce, thus why the line number is off.)

I've checked all my OS settings and can't see any reason why the OS would be causing NBSPs to be inserted, so I think it has to be on Typora's end somehow.
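
For anyone without iconv or Emacs at hand, the same check can be sketched in a few lines of Python. This is only an illustration, not something Typora provides: the names `find_nbsp` and `find_nbsp_in_file` are made up for this example, and it assumes the file is saved as UTF-8. Columns are counted in characters, so they may not match the positions iconv prints.

```python
NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def find_nbsp(text):
    """Return 1-based (line, column) pairs for every NBSP in text."""
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch == NBSP:
                hits.append((line_no, col_no))
    return hits


def find_nbsp_in_file(path):
    """Hypothetical helper: read a UTF-8 file and report its NBSP positions."""
    with open(path, encoding="utf-8") as f:
        return find_nbsp(f.read())
```

Calling `find_nbsp_in_file("nbsp_test.md")` should return an empty list for a clean file and one pair per NBSP otherwise.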

abnerlee commented 2 years ago

So is the NBSP found in the source file, or does it come from copying the content in hybrid editing mode?

vassudanagunta commented 2 years ago

As described above, it happens when you type. It has nothing to do with copying. Please see the "very easy to reproduce" steps in the description.

Are you not able to reproduce this? I added the following to the description, though I think you should assume bug reports are about hybrid mode since that's why people use Typora 😉

> This only happens while typing in hybrid mode. There is no problem in source mode.

abnerlee commented 2 years ago

I mean, you cannot "see" an NBSP, so how do you tell that one is there? Do you find it in the file's source content, or when you paste the content into other apps?

elliottslaughter commented 2 years ago

I'm not sure how @vassudanagunta did it, but speaking for myself, I saved the file with Cmd-S and then looked at the contents of the file that Typora saved to disk. (You can see my iconv command up above, but anything that has the ability to look at bytes/raw contents should be able to do it.)

vassudanagunta commented 2 years ago

@abnerlee My IDE, WebStorm, shows them in its editor (see attached image) and also in its diff view. The last line was typed as per my repro steps above, and you can see the two NBSP characters, one following each inline span.

I work on various open source projects where documentation coexists with the code, so document files are frequently opened in the IDE, and are always opened in a diff viewer prior to checkin.

Screen Shot 2022-05-04 at 4 15 50 PM

elliottslaughter commented 2 years ago

I found another sequence that replicates this issue.

Open a new/empty file in Typora and type the following, and then save (e.g., as test_typora.md).

asdf [qwer](https://asdf) asdf

Now run it through iconv:

$ iconv -f ASCII test_typora.md 
asdf [qwer](https://asdf)
iconv: test_typora.md:1:25: cannot convert

Or alternatively, open the file in a text editor with the ability to visualize NBSP characters (Emacs can do this).

I'm on macOS 12.3.1 and Typora 1.2.4.

vassudanagunta commented 2 years ago

> I found another sequence that replicates this issue.

Confirmed. This will happen for any link you type, just like for inline code spans, if you do not press ENTER at the end of the line containing the link.

Not only that, it happens for ANY kind of inline span, including emphasis (*), strong (**), strikeout (~~) and highlight (==).

I knew I'd seen this in cases other than code spans, but couldn't remember them when I wrote this issue. Thank you!

I will update the issue to reflect this.

elliottslaughter commented 2 years ago

I continue to run into this, including in situations that are not adequately explained by the most recent hypothesis. In documents that I write, I often get 2, 3 or more NBSP characters in a document of approximately 1000-2000 words. This is kind of frustrating because every time I write something, I need to post-process it in another editor to remove the spurious NBSPs. I basically can't trust the output of Typora to be clean anymore.

I think this implies that there is somehow a way for NBSP to get into the document despite ENTER being pressed, because obviously I am not writing all my text in one big paragraph, and the NBSPs do not occur exclusively in the last paragraph.

However, because my documents are large, it's a little hard to diagnose a root cause. I'm not necessarily writing everything linearly, so it's not necessarily the case that seeing the final result file would be helpful.

I wonder if it would be possible to add a debug mode that tracks key presses, particularly correlated with what characters they put into the document? Then I can look at the log and see if I can correlate it with the bad pattern of behavior.
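
Until there is a fix, the post-processing step mentioned above could be scripted instead of done by hand in another editor. A minimal sketch, assuming the files are UTF-8 and that no NBSP in them is intentional (it replaces all of them, including any you typed on purpose). The names `replace_nbsp` and `clean_file` are hypothetical.

```python
NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def replace_nbsp(text):
    """Replace every NBSP with a regular space; return (cleaned_text, count)."""
    return text.replace(NBSP, " "), text.count(NBSP)


def clean_file(path):
    """Hypothetical helper: rewrite a UTF-8 file in place if it contains NBSPs.

    newline="" keeps the file's existing line endings untouched.
    Returns the number of NBSP characters that were replaced.
    """
    with open(path, encoding="utf-8", newline="") as f:
        original = f.read()
    cleaned, count = replace_nbsp(original)
    if count:
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(cleaned)
    return count
```

This only hides the symptom; it does not explain how the NBSPs get into the document in the first place.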

vassudanagunta commented 2 years ago

The most efficient thing to do is to fix the problem for the known repro steps. There's a good chance fixing that problem will fix the others.

elliottslaughter commented 11 months ago

To make this easier to catch, here's an even more minimal reproducer:

  1. Open Typora to a new file.
  2. Type the following: `*A* a. *A* b.` (Just to be pedantic, that's: star, capital-A, star, space, lowercase-a, period, space, star, capital-A, star, space, lowercase-b, period.)
  3. Do NOT press enter.
  4. Save the file.

For posterity, I am attaching a copy of the file that is created here: bug5211.md

To confirm that the file contains Unicode code point U+00A0, you can run ripgrep:

$ rg -c '\u{00A0}' bug5211.md 
1

You can also directly confirm the contents of the file contains non-ASCII characters:

$ hexdump -C bug5211.md 
00000000  2a 41 2a c2 a0 61 2e 20  2a 41 2a 20 62 2e        |*A*..a. *A* b.|
0000000e
$ iconv -f ASCII bug5211.md
iconv: iconv(): Illegal byte sequence

For comparison you can directly check the equivalent pure-ASCII character sequence by just typing it into the command line:

$ echo -n '*A* a. *A* b.' | hexdump -C
00000000  2a 41 2a 20 61 2e 20 2a  41 2a 20 62 2e           |*A* a. *A* b.|
0000000d
$ echo -n '*A* a. *A* b.' | iconv -f ASCII
*A* a. *A* b.

I am currently on macOS 14.2.1 with Typora 1.7.6 (7018).
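
The byte-level difference can also be checked in Python. This sketch uses only the bytes copied from the two hexdump outputs above; the variable names are just for illustration. It shows that the pair `c2 a0` is the UTF-8 encoding of U+00A0 and that it sits exactly where the first regular space (`20`) was typed.

```python
# Bytes copied from the two hexdump outputs above.
saved = bytes.fromhex("2a 41 2a c2 a0 61 2e 20 2a 41 2a 20 62 2e")  # file saved by Typora
typed = bytes.fromhex("2a 41 2a 20 61 2e 20 2a 41 2a 20 62 2e")     # pure-ASCII sequence

# U+00A0 (NO-BREAK SPACE) encodes to the two bytes c2 a0 in UTF-8.
assert "\u00a0".encode("utf-8") == b"\xc2\xa0"

saved_text = saved.decode("utf-8")
typed_text = typed.decode("ascii")

# 0-based character indexes at which the two strings differ.
diffs = [i for i, (a, b) in enumerate(zip(saved_text, typed_text)) if a != b]

print(diffs)                                           # [3]
print([f"U+{ord(saved_text[i]):04X}" for i in diffs])  # ['U+00A0']
```

Both strings are 13 characters long; the saved file is one byte longer only because the NBSP takes two bytes in UTF-8.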

abnerlee commented 11 months ago

We can find the NBSP in your attached md file, but when I type the sequence you describe ("star, capital-A, star, space, lowercase-a, period, space, star, capital-A, star, space, lowercase-b, period") it is fine.

Do you use any IME or non-English keyboard layout?

vassudanagunta commented 11 months ago

@abnerlee are you saying you've never reproduced this? I get this consistently since I first reported it 20 months ago, on every version of Typora I've used and even after I got a new Mac.

The key thing is DO NOT PRESS ENTER on the line you type this.

The other possibility is it only happens for some combination of Typora settings.

This issue and #5228 make it a headache to use Typora on git managed content, as it introduces changes that have to be undone before code review.

elliottslaughter commented 11 months ago

For what it's worth, I'm using a MacBook Pro 2020 model with the built-in keyboard. My keyboard and language are set to U.S. English, as you can see in the screenshot below.

Screenshot 2024-01-07 at 1 21 43 PM

I can't recall having set anything that would have impacted this. I had never opened any of these dialogs before checking this.

I would be happy to share my Typora settings and/or test after resetting to defaults, assuming that I can recover my current settings again afterward.

vassudanagunta commented 11 months ago

Likewise. I just use English and don't have anything installed that affects typing.

Screenshot 2024-01-07 at 4 55 56 PM Screenshot 2024-01-07 at 4 56 40 PM Screenshot 2024-01-07 at 4 56 58 PM

elliottslaughter commented 11 months ago

I think I found it:

Screenshot 2024-01-07 at 1 41 44 PM

The key setting is When Writing: Ignore whitespace and single line break.

With When Writing: Ignore whitespace and single line break, the problem reproduces.

With When Writing: Preserve whitespace and single line break, the problem does NOT reproduce.

After starting fresh with a new user account and fresh download of Typora, that seems to be the one setting that controls whether this reproduces or not.

vassudanagunta commented 11 months ago

  1. I confirmed @elliottslaughter's last observation. It doesn't happen if I switch to When Writing: Preserve whitespace and single line break.

  2. Another important repro condition. Not only must you not press ENTER, you must not move the cursor to another line before saving/closing the file.

    This happens frequently when editing just one sentence, e.g. adding bold or inline code.

    So, to be very clear, you can reproduce it with this very common edit:

    1. Go to any existing line in a file with the intent to add some bold text or inline code.
    2. Place the cursor at the beginning of a word.
    3. Type some bold text or inline code, pressing space after the closing markup to separate it from the following word. DO NOT PRESS ENTER. DO NOT MOVE THE CURSOR TO ANOTHER LINE.
    4. Close the file.

Now examine the content, e.g. with a git diff or in an editor like my IDE (WebStorm) that shows special characters.

elliottslaughter commented 10 months ago

I have now found a sequence of keypresses that reproduces even on the default configuration of Typora. I believe this means that we are back to having no workarounds for this issue, at least for the sequence of keypresses below. This example is also quite short.

The text is:

&gt; 1

That is: ampersand, lowercase-g, lowercase-t, semicolon, space, number-1.

Save the file via any method (menu or Cmd-S).

Note: for this test, it actually does not matter if you press enter or not. You can if you want, it will make no difference either way. The output below is assuming you do not press enter.

Test result:

$ iconv -f ASCII test16.md 
iconv: iconv(): Illegal byte sequence
$ hexdump -C test16.md
00000000  26 67 74 3b c2 a0 31                              |&gt;..1|
00000007

The exact HTML entity does not matter; I only picked `&gt;` because it is short. I originally noticed the issue with `&times;`.

On macOS 14.2.1 and Typora 1.8.5.
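
Since there now seem to be at least two triggers (closing inline markup and HTML entities), it might help to sort the NBSPs in a saved file by what precedes them. The sketch below is a rough diagnostic aid, not a description of Typora's behavior: the function name `classify_nbsp`, the set of "closing markup" characters, and the entity pattern are all assumptions based on the cases reported in this thread.

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE

# Assumed closers for the inline spans discussed in this thread:
# emphasis/strong (*), strikeout (~), highlight (=), code (`), links ()).
MARKUP_CLOSERS = ("*", "~", "=", "`", ")")

# A named or numeric HTML entity ending right before the NBSP, e.g. "&gt;".
ENTITY_BEFORE = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);\Z")


def classify_nbsp(text):
    """Count NBSPs by the kind of text that immediately precedes them."""
    counts = {"after_markup": 0, "after_entity": 0, "other": 0}
    for i, ch in enumerate(text):
        if ch != NBSP:
            continue
        before = text[max(0, i - 40):i]  # short look-behind window
        if ENTITY_BEFORE.search(before):
            counts["after_entity"] += 1
        elif before.endswith(MARKUP_CLOSERS):
            counts["after_markup"] += 1
        else:
            counts["other"] += 1
    return counts
```

A non-zero `other` count on a real document would point to yet another trigger that is not covered by the repro steps so far.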