chmllr / notehub

A pastebin for markdown pages.
MIT License
273 stars, 33 forks

[deriveTitle] The title should be the first line (stripped of special chars)? #43

Closed samdx closed 7 years ago

samdx commented 7 years ago

Hi,

Idea: the new note's title should be the first line, with all special characters stripped out.

The header (Markdown: #, ##) is the title. This keeps all non-ASCII characters (accented Latin, CJK, ...) but not special symbols such as #, $ and so on. Those characters are replaced by -.

What about this?

The original:

var deriveTitle = text => text
  .split(/[\n\r]/)[0].slice(0,25)
  .replace(/[^a-zA-Z0-9\s]/g, "");

Changed to:

var deriveTitle = text => text
  .split(/[\n\r]/)[0]
  .replace(/[`~!@#$%^&*_|+=?;:'",.<>\{\}\\\/]/gi, "-");
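A note on the blacklist approach: any symbol missing from the character class (parentheses and square brackets, for example) passes through untouched. A possible alternative, shown here only as a sketch, is a whitelist built on Unicode property escapes. The name `deriveTitleUnicode` is hypothetical and not part of notehub, and the `u` flag with `\p{...}` needs an ES2018-capable engine.

```javascript
// Hypothetical alternative to the blacklist above: keep letters, combining
// marks, digits and whitespace from any script; replace everything else with "-".
const deriveTitleUnicode = text => text
  .split(/[\n\r]/)[0]
  .replace(/[^\p{L}\p{M}\p{N}\s]/gu, "-");

console.log(deriveTitleUnicode("# Con cáo nâu"));
// prints "- Con cáo nâu"
```

As with the proposed version, the leading # becomes a -, so the heading marker would still need to be dropped separately if the bare text is wanted.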

chmllr commented 7 years ago

Hi, first things first: which problem are we solving? :)

samdx commented 7 years ago

@chmllr The original method strips out every non-ASCII character, including Unicode letters such as accented Latin, Vietnamese, CJK, ...

Here is an example:

# Con cáo nâu nhanh nhẹn nhảy qua con chó lười biếng

It becomes:

Con c o n u nhanh nh n nh y qua con ch l i bi ng

I think it should be:

Con cáo nâu nhanh nhẹn nhảy qua con chó lười biếng

More semantic.
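For anyone reproducing this, here is a quick check of the original `deriveTitle` exactly as quoted in the first comment. Run on its own, the snippet also truncates the line to 25 characters and deletes the stripped characters rather than leaving spaces, so its output differs slightly from the result shown above. The deployed site may have processed the text differently, so treat this as a check of the snippet only. The `heading` variable is just for illustration.

```javascript
// The original deriveTitle, copied from the first comment in this thread.
const deriveTitle = text => text
  .split(/[\n\r]/)[0].slice(0, 25)
  .replace(/[^a-zA-Z0-9\s]/g, "");

// Normalize to NFC so that each accented letter is a single UTF-16 code unit.
const heading = "# Con cáo nâu nhanh nhẹn nhảy qua con chó lười biếng".normalize("NFC");

// JSON.stringify makes the leading and trailing spaces visible.
console.log(JSON.stringify(deriveTitle(heading)));
// prints " Con co nu nhanh nhn "
```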

chmllr commented 7 years ago

Thanks a lot for pointing out this issue! Ironically, the stripping is this strict only because it is legacy code: it was necessary back when the original note titles were used as note URLs.

On the weekend, I will convert markdown to plain text and take the first line or the first 25 chars.
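One way to read that plan, as a rough sketch only and not the code from the eventual fix: strip the most common Markdown markers from the first non-empty line, then cap the result at 25 characters. The helper `stripMarkdown` is hypothetical, and its regex-based stripping is an assumption that only covers ATX headings, links, images and emphasis markers.

```javascript
// Hypothetical helper: a deliberately naive Markdown-to-text pass for one line.
const stripMarkdown = line => line
  .replace(/^\s{0,3}#{1,6}\s+/, "")           // ATX heading markers (#, ##, ...)
  .replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1")  // links and images -> their text
  .replace(/[*_`~]/g, "")                     // emphasis, code and strikethrough markers
  .trim();

// First non-empty line as plain text, capped at maxLength code points.
const deriveTitle = (text, maxLength = 25) => {
  const firstLine = text.split(/[\n\r]+/).find(l => l.trim() !== "") || "";
  return Array.from(stripMarkdown(firstLine)).slice(0, maxLength).join("");
};

console.log(deriveTitle("## Hello *world*\nsecond line"));
// prints "Hello world"
```

Counting with `Array.from` rather than `slice` on the raw string avoids cutting a surrogate pair in half. A real Markdown parser would be more robust than these regexes, which also strip underscores inside ordinary words.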

chmllr commented 7 years ago

Fixed in b2a0f2ae6efab73cc8a5b52863a501052fa8e19f