Version 0.2.0 Iteration Plan

Estimated Release Date: 3/12
Release Manager: @suiguoxin
Schedule:

Features

[x] P0 Feature Planning @iofu728 @lunaqiu ETA: 1.16
[x] P0 Interface Definition: Engine < Core < Wrapper < Applications #52 @SiyunZhao ETA: 1.16
[x] P0 Layered refactor @SiyunZhao @iofu728
[x] fixed/customized/accurate/target/max compression ratio
- [x] P0 doc @SiyunZhao #69
[x] P1 Support customized compression specs, such as user-specified segment boundaries and compression ratios
- [x] P0 Support using <llmlingua ratio=?? compress=??> </llmlingua> to identify compression segment boundaries
- [x] Support preserving essential characters
[x] Bug fixes (TBD after interface refactoring)
- [x] P1 #50
[x] P1 Support more models, including small LMs such as Phi-2 ETA: 2 days #67
[x] P0 Support pure json interface & doc @SiyunZhao #120
[x] PR (Ch, 1000 words) @lunaqiu @iofu728 1.17
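
To make the segment-boundary item above concrete, here is a minimal sketch of how `<llmlingua ratio=?? compress=??>` markers could be split into segments before compression. It is an illustration only, not the project's implementation: the attribute syntax (space-separated `key=value` pairs), the `Segment` class, the `parse_segments` helper, and the choice to treat untagged text as compressible at a default ratio are all assumptions.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Assumed tag grammar: <llmlingua ratio=0.4 compress=true> ... </llmlingua>
TAG_RE = re.compile(
    r"<llmlingua(?P<attrs>[^>]*)>(?P<body>.*?)</llmlingua>",
    re.DOTALL,
)
ATTR_RE = re.compile(r"(\w+)\s*=\s*([^\s>]+)")


@dataclass
class Segment:
    """Hypothetical container for one span of the prompt."""
    text: str
    ratio: Optional[float]  # None means "fall back to a global default"
    compress: bool


def parse_segments(prompt: str) -> List[Segment]:
    """Split a prompt into tagged and untagged segments, in order."""
    segments: List[Segment] = []
    pos = 0
    for match in TAG_RE.finditer(prompt):
        if match.start() > pos:
            # Assumption: untagged text is compressible at the default ratio.
            segments.append(Segment(prompt[pos:match.start()], None, True))
        attrs = {
            key: value.strip("\"'")
            for key, value in ATTR_RE.findall(match.group("attrs"))
        }
        ratio = float(attrs["ratio"]) if "ratio" in attrs else None
        compress = attrs.get("compress", "true").lower() == "true"
        segments.append(Segment(match.group("body"), ratio, compress))
        pos = match.end()
    if pos < len(prompt):
        segments.append(Segment(prompt[pos:], None, True))
    return segments


example = (
    "Intro. <llmlingua ratio=0.4 compress=true>Long context.</llmlingua>"
    "<llmlingua compress=false>Keep me.</llmlingua> Tail"
)
for segment in parse_segments(example):
    print(segment)
```

Each resulting segment could then be passed to the compressor with its own ratio, or skipped entirely when `compress` is false.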

[ ] P1 Experiment: target compression ratio vs. actual compression ratio on specific data
[ ] P1 Word-level compression #4: coarser than token level, finer than sentence level; list the different mappings and design the interface
[ ] P1 Support more / faster engines #41, including llama_cpp, FasterTransformer, vLLM ETA: TBD
- [ ] survey which engines to support
[ ] P2 Documentation and examples
- [ ] Supported models and experiment results (with compressor throughput) after a faster engine is supported