haystack / murmur

A mailing list designed to reduce noise and encourage sharing

Extracting hash tag is really buggy #261

Open soyapark opened 5 years ago

soyapark commented 5 years ago

https://github.com/haystack/murmur/blob/3983477571d211a20109b7d53f853f9581120223/engine/constants.py#L24-L34

This is very buggy. The function uses a simple regex that matches any string starting with #. However, it is given HTML text, so it can end up adding a bunch of unintended tags (e.g. the ID of an HTML element or a hex color code). We might want to pass in plain text instead, or come up with a stronger regex. It is worth referring to the old issue #27.
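To illustrate the "take plain text instead" option, here is a minimal sketch, not the code currently in engine/constants.py. The names `_TextExtractor`, `HASHTAG_RE`, and `extract_hashtags` are hypothetical, and the pattern is only one possible "stronger regex". It assumes a tag must start with a letter and must sit at the start of the text or right after whitespace. It strips markup with the standard-library `html.parser` first, so attribute values, `<style>`/`<script>` contents, and numeric character references such as `&#39;` never reach the regex.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text nodes only, skipping <style> and <script> bodies."""

    def __init__(self):
        # convert_charrefs=True turns references like &#39; into characters,
        # so the '#' inside them is never seen by the hashtag regex.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return " ".join(self._chunks)


# Assumed rule (hypothetical): '#' at the start of the text or after
# whitespace, followed by a letter and then any word characters.
HASHTAG_RE = re.compile(r"(?<!\S)#([A-Za-z]\w*)")


def extract_hashtags(html):
    """Return hashtag names (without '#') found in the text content of html."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return HASHTAG_RE.findall(parser.text())


sample = (
    '<div id="#main" style="color:#ff0000">'
    '<a href="#top">link</a>'
    "<p>Try #murmur and #mailing_list today</p>"
    "<style>.x{color:#abc}</style></div>"
)
print(extract_hashtags(sample))
```

One caveat: a hex color written in the visible text that begins with a letter (say `#fff`) would still match, so this narrows the problem but does not eliminate it.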