gjtorikian / commonmarker

Ruby wrapper for the comrak (CommonMark parser) Rust crate
MIT License

Parse tree missing text for code and code blocks #289

Closed: rossta closed this issue 1 month ago

rossta commented 1 month ago

Problem

Given documents parsed with default settings from Markdown that contains either inline code or a fenced code block:

````ruby
code_doc = Commonmarker.parse("I have `some code` inline")

code_block_doc = Commonmarker.parse(<<~MD)
  ```ruby
    def wibble
      "wobble"
    end
  ```
MD
````


The parsed documents do not contain string content or text children on the code or code block nodes:

```ruby
code_doc.walk do |node|
  if node.type == :code
    puts node.inspect
  end
end
# => #<Commonmarker::Node(code):
# source_position={:start_line=>1, :start_column=>9, :end_line=>1, :end_column=>12}
# >

code_block_doc.walk do |node|
  if node.type == :code_block
    puts node.inspect
  end
end
# => #<Commonmarker::Node(code_block):
# source_position={:start_line=>1, :start_column=>1, :end_line=>5, :end_column=>3},
# fence_info="ruby"
# >
```

Expected Behavior

The parsed "code nodes" contain either string content or a set of children containing a text node with the string content:

  # => #<Commonmarker::Node(code): 
  # source_position={:start_line=>1, :start_column=>9, :end_line=>1, :end_column=>12}
+ # string_content=\"some code\"
  # >

  # => #<Commonmarker::Node(code_block): 
  # source_position={:start_line=>1, :start_column=>1, :end_line=>5, :end_column=>3},
  # fence_info="ruby"
+ # string_content=\"  def wibble\n    \"wobble\"\n ...\"
  # >
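
Until the nodes expose their text, one possible stopgap is to cut the text back out of the original Markdown using the `source_position` the nodes already report. This is only a sketch under assumptions, not Commonmarker API: `slice_source` is a hypothetical helper, positions are assumed to be 1-based and inclusive, and the input is assumed to be ASCII so that columns line up with character offsets.

```ruby
# Hypothetical helper, NOT part of Commonmarker: slice the original Markdown
# using a node's source_position hash. Assumes 1-based, inclusive lines and
# columns, and ASCII-only input so columns equal character offsets.
def slice_source(markdown, pos)
  lines = markdown.lines(chomp: true)
  selected = lines[(pos[:start_line] - 1)..(pos[:end_line] - 1)]
  # Trim the end of the last line first so a single-line span keeps its offsets.
  selected[-1] = selected[-1][0...pos[:end_column]]
  selected[0] = selected[0][(pos[:start_column] - 1)..]
  selected.join("\n")
end

# Rebuild the fenced example from the report without writing a literal fence.
fence = "`" * 3
source = ["#{fence}ruby", "  def wibble", '    "wobble"', "  end", fence].join("\n")

# Position copied from the code_block node's inspect output in the report.
block_pos = { start_line: 1, start_column: 1, end_line: 5, end_column: 3 }

fenced = slice_source(source, block_pos)
# Drop the opening and closing fence lines to leave only the code.
body = fenced.lines[1...-1].join
puts body
```

For the code block this recovers the whole fenced span, so the fence lines have to be dropped by hand. That is fragile compared with a real `string_content` on the node, which is why the accessor is still the better fix.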

Thank you for Commonmarker! I'm excited to see the re-introduction of the AST behavior. Apologies if this is expected behavior and perhaps I'm misunderstanding the defaults.

kivikakk commented 1 month ago

You're right, there's no way to access it at all. There's a slight complication: in the underlying library, the string content is stored in a different place entirely. Still, I see no reason Commonmarker can't abstract that away.

I'll get a PR up and see what @gjtorikian thinks!
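
To illustrate the kind of abstraction described above, here is a minimal plain-Ruby sketch. Everything in it is hypothetical: `SketchNode` is a stand-in rather than Commonmarker's real node class, and the split between a `text` slot and a `literal` slot is only an assumed model of "stored in a different place", not a description of the underlying library.

```ruby
# Hypothetical stand-in for a parsed node; NOT Commonmarker's real class.
class SketchNode
  attr_reader :type, :children

  # Assumed storage model: text nodes keep their string in @text, while
  # code and code_block nodes keep theirs in a separate @literal slot.
  def initialize(type, text: nil, literal: nil, children: [])
    @type = type
    @text = text
    @literal = literal
    @children = children
  end

  # One accessor, regardless of where the string is stored internally.
  def string_content
    case type
    when :text then @text
    when :code, :code_block then @literal
    else raise TypeError, "#{type} nodes have no string content"
    end
  end

  # Depth-first traversal, yielding this node and then each descendant.
  def walk(&block)
    block.call(self)
    children.each { |child| child.walk(&block) }
  end
end

# Hand-built tree for: I have `some code` inline
doc = SketchNode.new(:document, children: [
  SketchNode.new(:paragraph, children: [
    SketchNode.new(:text, text: "I have "),
    SketchNode.new(:code, literal: "some code"),
    SketchNode.new(:text, text: " inline")
  ])
])

code_strings = []
doc.walk { |node| code_strings << node.string_content if node.type == :code }
puts code_strings.inspect
```

The point of the sketch is that callers only ever touch `string_content`; where the string actually lives stays an internal detail.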

gjtorikian commented 1 month ago

Not intentional at all! Thank you for reporting it.