Position of RBS::Location should be byte length

ksss commented 7 months ago

I'm developing a tool using RBS. And I'm trying to load RBS files using RBS::MethodType#location and display them with colors.

During development, a problem occurred where the display is off by one character only in core/math.rbs.

In core/math.rbs, the character π is used in the documentation, which is a 2 byte character.

Upon investigation, it appears that the positions in Location are recorded not in byte size but in the number of UTF-8 characters.

# Whether the comment is `π` or `p`, the class location has not changed.

::RBS::Parser.parse_signature("#π\nclass Foo\nend")[2][0].location.start_pos
#=> 3
::RBS::Parser.parse_signature("#p\nclass Foo\nend")[2][0].location.start_pos
#=> 3

Recording the position in bytesize would make it more convenient for reading with methods like IO#read, so I believe it should be recorded in bytesize.

What do you think?

soutaro commented 7 months ago

I think we need both of character position and byte position, based on the use cases. Storing byte positions and calculate character position when we need, like Prism, might be a good way.

pocke commented 7 months ago

I'm concerned about that it introduces performance degradation if RBS calculates a character position from a byte position. Currently, RBS uses only character positions at the Ruby level; therefore, it will need the calculations for every RBS::Location usage.

ksss commented 7 months ago

We can use a workaround to temporarily avoid the issue by utilizing methods like start_loc and end_loc to ignore the effects of comments.

ruby / rbs

Position of RBS::Location should be byte length #1814