Closed calebhearth closed 6 years ago
Unfortunately, the paragraphs are identified by content using an algorithm that makes sense for prose but probably not for code. That means this is likely to be a common issue for comment blocks attached to code snippets.
Seems like we may need a different or smarter algorithm? Code blocks are fairly easy to identify, so it seems possible to treat them differently.
Sent from my iPhone
On Jul 14, 2014, at 8:52 AM, pat brisbin notifications@github.com wrote:
Unfortunately, the paragraphs are identified by content using an algorithm that makes sense for prose but probably not for code. That means this is likely to be a common issue for comment blocks attached to code snippets.
— Reply to this email directly or view it on GitHub.
Additionally, the same paragraph would probably never appear twice in an article, and if it had I imagine that it would have a contextually different meaning and wouldn't make sense to have the same comments on.
Sent from my iPhone
On Jul 14, 2014, at 8:52 AM, pat brisbin notifications@github.com wrote:
Unfortunately, the paragraphs are identified by content using an algorithm that makes sense for prose but probably not for code. That means this is likely to be a common issue for comment blocks attached to code snippets.
— Reply to this email directly or view it on GitHub.
Idea: what if we added an index to the identifier for paragraphs with the same hash? Such as:
"Hello there" => "Ht1" "My friend" => "Mf1" "Hello there" => "Ht2" "Hi there" => "Ht3"
That will make it about as change-proof as it is now, but it will never put the same comment on two different paragraphs. It will also work well for code. Lastly, it will be easy to bring forward existing data (add a "1" to the end of every existing identifier), as opposed to a totally new hashing algorithm.
I think adding a new paragraph is far too likely to introduce an index and not have the algorithm look for partial matches if the one we have is not found. We want to add that anyone, but it may be the impetus for the change.
That being said, I think it might be more fruitful to explore using a different algorithm for code blocks than for prose.
I’d like to see some more examples of this in the real world to get a better sense of how urgent it is.
I've encountered the same issue while reading through this post: http://robots.thoughtbot.com/middleman-bourbon-walkthrough. There are a bunch of blocks that have "5 comments", when it's really only applicable to the first occurrence (at source/stylesheets/all.css.scss
header).
The only difference I see compared to Caleb's example is that it's only happening at each of the succeeding file name headers prior to each code block (instead of at the code blocks).
On this post, there are two code "paragraphs" which are being detected as the same paragraph, so the comments are showing up in both places. http://robots.thoughtbot.com/external-posts-in-jekyll