willbasky / hibet

Tibetan-English translator for CLI
BSD 3-Clause "New" or "Revised" License

Get rid of parsers-megaparsec #109

Closed willbasky closed 2 years ago

willbasky commented 2 years ago

It is not maintained and keeps dependencies outdated.

willbasky commented 2 years ago

radixtree depends on Parsec. The options are: get rid of radixtree (swap it for another library), fork radixtree and base it on megaparsec, or fork parsers-megaparsec (or open a PR) and update it.

willbasky commented 2 years ago

To complete the task, I wanted to replace radixtree's search with lookup, because search requires a constraint from parsers-megaparsec while lookup doesn't.

search can parse dirty strings, e.g.

> parseWylieInput radix "(balka)"
> parseWylieInput radix "(balkana)"
ExceptT (Identity (Right [([],[["balka"]])]))

It reads the words found in the radix tree and drops everything else. Therefore dirty Wylie text should be parsed more carefully beforehand. If parsing of dirty Wylie text is not needed anymore, then `lookup` will be enough. There are two approaches:

  1. Add non-Wylie characters and strings to the syllables, then parse (1.balka) -> ["(", "1", ".", "balka", ")"] and map lookup over the result, so it becomes "(1.བལྐ)" or "(༡.བལྐ)", etc.
  2. Parse the dirty text by separating non-Wylie content from Wylie, (1.balka) -> ["(", "1", ".", "balka", ")"], then map lookupLenient over the result (lookupLenient leaves non-Wylie tokens untouched) and convert the Wylie to Tibetan script ("(1.བལྐ)" or "(༡.བལྐ)", etc.).
    
    lookupLenient :: RadixTree a -> Text -> Text
    lookupLenient radix t = case lookup radix t of
      Nothing -> t
      Just (t', _) -> t'

> map (lookupLenient radix) ["(", "balka", ")"]
["(","balka",")"]


In both approaches the dirty text must first be roughly separated,
so the first approach has no advantage.
It is also better to keep the syllables free of non-Wylie entries,
so the second approach is the right choice.
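
The second approach could be sketched roughly as follows. This is only an illustration under stated assumptions, not hibet's code. A plain `Data.Map` stands in for the radix tree, and `String` is used instead of `Text`. The names `Syllables`, `isWylieChar`, `tokenize`, `convert` and `sample` are hypothetical. The rule that a Wylie character is an ASCII lowercase letter or an apostrophe is a deliberate simplification. The single table entry is taken from the example above.

```haskell
import Data.Char (isAsciiLower)
import Data.Function (on)
import Data.List (groupBy)
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for the radix tree: Wylie syllable -> Tibetan script.
type Syllables = Map.Map String String

-- Assumption: a Wylie character is an ASCII lowercase letter or an apostrophe.
isWylieChar :: Char -> Bool
isWylieChar c = isAsciiLower c || c == '\''

-- Split dirty text into a flat list: each run of Wylie characters stays
-- whole, every non-Wylie character becomes a token of its own.
tokenize :: String -> [String]
tokenize = concatMap split . groupBy ((==) `on` isWylieChar)
  where
    split run@(c : _) | isWylieChar c = [run]
    split run = map (: []) run

-- Convert a token if it is in the table, otherwise leave it untouched.
lookupLenient :: Syllables -> String -> String
lookupLenient syllables t = Map.findWithDefault t t syllables

-- Tokenize, convert each token leniently, and glue the result back together.
convert :: Syllables -> String -> String
convert syllables = concatMap (lookupLenient syllables) . tokenize

-- Sample table with the single entry used in the example above.
sample :: Syllables
sample = Map.fromList [("balka", "བལྐ")]
```

With these definitions, `tokenize "(1.balka)"` yields `["(", "1", ".", "balka", ")"]` and `putStrLn (convert sample "(1.balka)")` prints `(1.བལྐ)`. Because the table holds only Wylie keys, the non-Wylie tokens pass through unchanged, which is the property that favours the second approach.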

Implementation steps:

  1. Reduce the current transformation to a flat list. Some text won't be consumed and will be dropped.
  2. Refactor the current parsers to separate the text more correctly into a flat list.