Open RogerPodacter opened 3 months ago
Longer term it might be beneficial to force the user to specify the encoding.
Or we will implement actual types in Ruby 🤡
Will run the tests once I'm back in the office.
Thanks!! And would types really be so crazy here? The hex v. bytes ambiguity is just so brutal for our use-case!
Should we force everyone to use a special class? Definitely annoying but maybe worth it to have a decent night's sleep?
Here's what the AI spit out. Thoughts?
class ByteString
attr_reader :bytes
# Factory methods
def self.from_hex(hex_string)
# Remove leading '0x' if present
hex_string = hex_string[2..] if hex_string.start_with?('0x')
# Ensure it's a valid hex string and has an even number of characters
raise ArgumentError, "Invalid hex string" unless hex_string.match?(/\A[\h]*\z/)
raise ArgumentError, "Hex string length must be even" unless hex_string.length.even?
new([hex_string].pack('H*'))
end
def self.from_bin(binary_string)
raise ArgumentError, "Binary string expected with ASCII-8BIT encoding" unless binary_string.is_a?(String) && binary_string.encoding == Encoding::ASCII_8BIT
new(binary_string)
end
# Convert back to hex
def to_hex
@bytes.unpack1('H*')
end
# Convert to binary
def to_bin
@bytes
end
# Prevent calling to_s to avoid ambiguity
def to_s
raise NoMethodError, "Ambiguous conversion: Use `to_bin` for bytes or `to_hex` for hex representation"
end
# For debugging or string representation
def inspect
"#<ByteString hex=#{to_hex} length=#{@bytes.bytesize}>"
end
# Equality comparison
def ==(other)
other.is_a?(ByteString) && @bytes == other.bytes
end
# More strict equality comparison (used by Hash)
def eql?(other)
self == other
end
# Hash method to allow use in sets or as hash keys
def hash
@bytes.hash
end
private_class_method :new
# Initialize with binary string (bytes)
def initialize(binary_string)
raise ArgumentError, "Binary string must have ASCII-8BIT encoding" unless binary_string.encoding == Encoding::ASCII_8BIT
@bytes = binary_string.dup.freeze
end
end
Looking at
Util.prefixed? arg
to determine whether to treat something as hex makes it impossible to encode anything that starts with the bytes[48,120]
, which is not that infrequent.Also, I don't think we get any benefit from this part of the conditional anyway since if the string isn't both hex chars and 2x the input size things will fail anyway.
Longer term it might be beneficial to force the user to specify the encoding.
Thanks!