Stenway / RSV-Challenge

RSV = Rows of String Values
https://www.stenway.com

RSV in Elixir #3

Open dcoai opened 8 months ago

dcoai commented 8 months ago

Here is a quick attempt at RSV in Elixir:

defmodule RSV do                                                                                                                 
  @end_of_rec  <<0xFD>>                                                                                                          
  @null        <<0xFE>>                                                                                                          
  @end_of_item <<0xFF>>                                                                                                          

  def encode(list), do: Enum.reduce(list, <<>>, fn row, acc -> acc <> encode_row(row) end)                                       
  defp encode_row(row), do: Enum.reduce(row, <<>>, fn item, acc -> acc <> encode_item(item) <> @end_of_item end) <> @end_of_rec  
  defp encode_item(nil), do: @null                                                                                               
  defp encode_item(val) when is_binary(val), do: val                                                                             

  def decode(data), do: data |> split(@end_of_rec, "document") |> Enum.map(&decode_row/1)                                        
  defp decode_row(row), do: row |> split(@end_of_item, "row") |> Enum.map(&decode_item/1)                                        
  defp decode_item(@null), do: nil                                                                                               
  defp decode_item(val), do: val                                                                                                 

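  # An empty document has no rows and an empty row has no items.
  defp split(<<>>, _separator, _component), do: []

  defp split(data, separator, component) do
    String.ends_with?(data, separator) || raise "Incomplete RSV #{component}"
    String.split(data, separator) |> Enum.drop(-1)
  end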
end

and some unit tests:

defmodule RSVTest do                                                                                                             
  use ExUnit.Case                                                                                                                
  doctest RSV                                                                                                                    

  def hello_world_rsv, do: <<72, 101, 108, 108, 111, 255, 240, 159, 140, 142, 255, 254, 255, 255, 253>>                          

  def hello_world, do: [["Hello", "🌎", nil, ""]]                                                                                

  test "greets the world" do                                                                                                     
    assert RSV.encode(hello_world()) == hello_world_rsv()                                                                        
  end                                                                                                                            

  test "decodes the world" do                                                                                                    
    assert RSV.decode(hello_world_rsv()) == hello_world()                                                                        
  end                                                                                                                            

  test "encodes and decodes" do                                                                                                  
    assert RSV.encode(hello_world()) |> RSV.decode() == hello_world()                                                            
  end                                                                                                                            
end

Stenway commented 8 months ago

Awesome :-) My understanding of Elixir currently extends only as far as Fireship's "Elixir in 100 Seconds" video.

Could you also try to check the 79 valid and 29 invalid test files in this directory? https://github.com/Stenway/RSV-Challenge/tree/main/TestFiles

I always like to check against these test files, to know if everything is working: https://github.com/Stenway/RSV-Challenge/blob/main/Python/rsv.py#L153

If it passes and you don't have a problem with the MIT-0 license, we could surely add it to the repository.

Thanks for your valuable input.

gausby commented 8 months ago

Here is my solution in Elixir:

defmodule RSV do
  @moduledoc """
  Documentation for `RSV`.
  """

  @terminate_value 0xFF
  @terminate_row 0xFD
  @null 0xFE

  @doc """
  Encode data containing list of lists of Strings and Nil
  """
  def encode(data_rows) when is_list(data_rows) do
    for row <- data_rows, into: <<>> do
      <<for(value <- row, into: <<>>, do: encode_value(value))::binary, @terminate_row>>
    end
  end

  defp encode_value(nil), do: <<@null, @terminate_value>>
  defp encode_value(value), do: <<value::binary, @terminate_value>>

  @doc """
  Decode RSV encoding data
  """
  def decode(<<data::binary>>), do: do_decode(data, [], [], [])

  defp do_decode(<<>>, [], [], acc), do: Enum.reverse(acc)

  defp do_decode(<<@terminate_row, rest::binary>>, [], current_row, acc) do
    do_decode(rest, [], [], [Enum.reverse(current_row) | acc])
  end

  defp do_decode(<<@terminate_value, rest::binary>>, current_value, current_row, acc) do
    value = current_value |> Enum.reverse() |> IO.iodata_to_binary()
    do_decode(rest, [], [value | current_row], acc)
  end

  defp do_decode(<<@null, @terminate_value, rest::binary>>, [], current_row, acc) do
    do_decode(rest, [], [nil | current_row], acc)
  end

  defp do_decode(<<char, rest::binary>>, value_acc, row_acc, acc) do
    do_decode(rest, [char | value_acc], row_acc, acc)
  end
end

It will complain loudly if the input data is invalid—which is good. Perhaps it performs too many Enum.reverse/1 operations in the decoder, and it doesn't do streaming.
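One thing the decoder above does not check is whether each value is valid UTF-8: its last clause accepts any byte. Below is a minimal sketch of a stricter variant, not a replacement for the code above. It assumes that returning `{:ok, rows}` or `{:error, reason}` is acceptable instead of raising; the module name `RSVStrict` and the error atoms are made up for illustration. It splits on the terminator bytes with `:binary.split/2` and validates every value with `String.valid?/1`.

```elixir
defmodule RSVStrict do
  # Hypothetical module name and error atoms, chosen for this sketch only.
  @terminate_value 0xFF
  @terminate_row 0xFD
  @null 0xFE

  # Returns {:ok, rows} or {:error, reason} instead of raising.
  def decode(<<data::binary>>), do: decode_rows(data, [])

  defp decode_rows(<<>>, rows), do: {:ok, Enum.reverse(rows)}

  defp decode_rows(data, rows) do
    # :binary.split/2 splits at the first occurrence only.
    case :binary.split(data, <<@terminate_row>>) do
      [row, rest] ->
        with {:ok, values} <- decode_values(row, []) do
          decode_rows(rest, [values | rows])
        end

      [_unterminated] ->
        {:error, :incomplete_row}
    end
  end

  defp decode_values(<<>>, values), do: {:ok, Enum.reverse(values)}

  defp decode_values(row, values) do
    case :binary.split(row, <<@terminate_value>>) do
      # A lone null byte before the terminator is a null value.
      [<<@null>>, rest] ->
        decode_values(rest, [nil | values])

      [value, rest] ->
        # Stray marker bytes such as 0xFE can never be part of valid UTF-8,
        # so this check rejects them as well.
        if String.valid?(value) do
          decode_values(rest, [value | values])
        else
          {:error, :invalid_utf8}
        end

      [_unterminated] ->
        {:error, :incomplete_value}
    end
  end
end
```

For example, `RSVStrict.decode(<<0x41, 0xFF>>)` returns `{:error, :incomplete_row}` because the row terminator is missing. Splitting on whole terminators avoids the per-byte accumulation and the extra `Enum.reverse/1` per value, though it still reads the whole input at once rather than streaming.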

gausby commented 8 months ago

I tried the implementation I posted yesterday (https://github.com/Stenway/RSV-Challenge/issues/3#issuecomment-1896646727) against the test fixtures, and it didn't pass the invalid cases. So I added a test setup that loads the test fixtures from the /TestFiles/ directory (may it never move) and fixed the errors. I turned it into a PR.

@Stenway You are welcome to the code, and the MIT license is fine with me.
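
For anyone who wants to reproduce the fixture check, here is a rough sketch of what such a setup could look like. This is not the code from the PR. It assumes the fixture files are named with a `Valid` or `Invalid` prefix and an `.rsv` extension, and that a decoder signals bad input either by raising or by returning an `{:error, _}` tuple; the module name `RSVFixtures` is hypothetical. It only checks whether each file is accepted or rejected, not whether the decoded rows match the expected content.

```elixir
defmodule RSVFixtures do
  # Hypothetical helper. The "Valid"/"Invalid" file name prefixes are an
  # assumption about how the fixture directory is laid out.

  # Returns a list of {file_name, expected, actual} for every mismatch,
  # so an empty list means every fixture behaved as expected.
  def check(dir, decode) when is_function(decode, 1) do
    dir
    |> Path.join("*.rsv")
    |> Path.wildcard()
    |> Enum.flat_map(fn path ->
      name = Path.basename(path)
      expected = if String.starts_with?(name, "Valid"), do: :ok, else: :error
      actual = attempt(decode, File.read!(path))
      if actual == expected, do: [], else: [{name, expected, actual}]
    end)
  end

  # Treats both a raised exception and an {:error, _} tuple as a rejection.
  defp attempt(decode, bytes) do
    case decode.(bytes) do
      {:error, _reason} -> :error
      _decoded -> :ok
    end
  rescue
    _exception -> :error
  end
end
```

In a test it could be used as `assert RSVFixtures.check("TestFiles", &RSV.decode/1) == []`, with the directory path adjusted to wherever the fixtures live relative to the project.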