libgit2 / rugged

ruby bindings to libgit2
MIT License

Rugged::Diff::Delta#new_file / #old_file return a wrongly encoded [:path] string #888

Open HM2468 opened 3 years ago

HM2468 commented 3 years ago

When I call new_file[:path] / old_file[:path] on a Rugged::Diff::Delta, I get the following result:

  [{:path=>"test/.keep", :type=>"added"},
   {:path=>"test/\xE4\xB8\xAD\xE6\x96\x87\xE5\x90\x8D\xE6\xB5\x8B\xE8\xAF\x95\xE4\xB8\x80.txt", :type=>"added"},
   {:path=>"test/\xE4\xB8\xAD\xE6\x96\x87\xE5\x90\x8D\xE6\xB5\x8B\xE8\xAF\x95\xE4\xB8\x89.txt", :type=>"added"},
   {:path=>"test/\xE4\xB8\xAD\xE6\x96\x87\xE5\x90\x8D\xE6\xB5\x8B\xE8\xAF\x95\xE4\xBA\x8C.txt", :type=>"added"},
   {:path=>"\xE4\xB8\xAD\xE6\x96\x87.  \xE5\x91\xBD\xE5\x90\x8D1.txt", :type=>"added"}]}

When I copy a path value into my pry/irb console, I get the correct string.

It seems the path is being decoded with the wrong encoding:

[66] pry(main)> "test/\xE4\xB8\xAD\xE6\x96\x87\xE5\x90\x8D\xE6\xB5\x8B\xE8\xAF\x95\xE4\xB8\x80.txt"
=> "test/中文名测试一.txt"
[67] pry(main)> "test/\xE4\xB8\xAD\xE6\x96\x87\xE5\x90\x8D\xE6\xB5\x8B\xE8\xAF\x95\xE4\xB8\x89.txt"
=> "test/中文名测试三.txt"
[68] pry(main)> "test/\xE4\xB8\xAD\xE6\x96\x87\xE5\x90\x8D\xE6\xB5\x8B\xE8\xAF\x95\xE4\xBA\x8C.txt"
=> "test/中文名测试二.txt"

carlosmn commented 3 years ago

The escape codes seem to be what you should expect. Git paths are arbitrary binary byte strings, so it looks like whatever you're using to print the structure is erring on the side of caution and escaping anything non-ASCII like that.

Does it still happen if you tag the strings with whatever the right encoding is for your repository?
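
A minimal sketch of what tagging the string could look like, assuming the path arrives tagged as binary (ASCII-8BIT) and the repository actually stores its paths as UTF-8. It uses a plain Ruby string as a stand-in for the value of new_file[:path]; raw_path and tag_as_utf8 are hypothetical names used only for illustration, not part of Rugged.

```ruby
# Stand-in for a path as returned by delta.new_file[:path].
# Assumption: the string is tagged as binary (ASCII-8BIT); String#b
# produces such a copy from the literal below.
raw_path = "test/\xE4\xB8\xAD\xE6\x96\x87\xE5\x90\x8D\xE6\xB5\x8B\xE8\xAF\x95\xE4\xB8\x80.txt".b

# Hypothetical helper: re-tag the same bytes as UTF-8 without transcoding.
# dup keeps the original string untouched; if the bytes are not valid
# UTF-8, the original binary string is returned unchanged.
def tag_as_utf8(path)
  tagged = path.dup.force_encoding(Encoding::UTF_8)
  tagged.valid_encoding? ? tagged : path
end

utf8_path = tag_as_utf8(raw_path)

puts raw_path.encoding   # the binary tag
puts utf8_path.encoding  # UTF-8 after re-tagging
puts raw_path.bytes == utf8_path.bytes  # the bytes are identical, only the tag differs
```

force_encoding only changes the encoding tag and leaves the bytes alone, which fits the point above that Git paths are plain byte strings. If a repository uses some other encoding for its paths, that encoding would have to be used instead of UTF-8.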