pemistahl / grex

A command-line tool and Rust library with Python bindings for generating regular expressions from user-provided test cases
https://pemistahl.github.io/grex-js/
Apache License 2.0
7.23k stars, 174 forks

Treat diffs as separate groups #122

Open suliveevil opened 2 years ago

suliveevil commented 2 years ago

For example:


<iframe src="//player.bilibili.com/player.html?aid=303065226&bvid=BV1dP411n7bc&cid=833485551&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true"> </iframe>

<iframe src="//player.bilibili.com/player.html?aid=261233537&bvid=BV1xe411j7EQ&cid=851171461&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true"> </iframe>

<iframe src="//player.bilibili.com/player.html?aid=558528772&bvid=BV1Ee4y1r7wX&cid=848823074&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true"> </iframe>

<iframe src="//player.bilibili.com/player.html?aid=455751094&bvid=BV1U5411s7RU&cid=383073940&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true"> </iframe>

The parts that differ between the test cases:

aid=303065226&bvid=BV1dP411n7bc&cid=833485551
aid=261233537&bvid=BV1xe411j7EQ&cid=851171461
aid=558528772&bvid=BV1Ee4y1r7wX&cid=848823074
aid=455751094&bvid=BV1U5411s7RU&cid=383073940

A regex that matches the differing parts:

aid=([0-9]+)&bvid=([0-9a-zA-Z]+)&cid=([0-9]+)
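
As a quick sanity check, the regex above can be run against the four differing fragments with Python's standard `re` module. This only confirms that the hand-written pattern matches every fragment in full and captures the three IDs; it does not involve grex, and the variable names are chosen here for illustration.

```python
import re

# The four differing fragments listed above.
samples = [
    "aid=303065226&bvid=BV1dP411n7bc&cid=833485551",
    "aid=261233537&bvid=BV1xe411j7EQ&cid=851171461",
    "aid=558528772&bvid=BV1Ee4y1r7wX&cid=848823074",
    "aid=455751094&bvid=BV1U5411s7RU&cid=383073940",
]

# The hand-written regex from this issue.
pattern = re.compile(r"aid=([0-9]+)&bvid=([0-9a-zA-Z]+)&cid=([0-9]+)")

# fullmatch ensures the whole fragment is covered, not just a prefix.
captures = [pattern.fullmatch(s).groups() for s in samples]

for aid, bvid, cid in captures:
    print(aid, bvid, cid)
```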

Current output of grex -f grex.txt -g:

<iframe src="//player\.bilibili\.com/player\.html\?aid=(((261233537&bvid=BV1xe411j7EQ&cid=85117146|303065226&bvid=BV1dP411n7bc&cid=83348555)1|455751094&bvid=BV1U5411s7RU&cid=383073940)|558528772&bvid=BV1Ee4y1r7wX&cid=848823074)&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true"> </iframe>$

Current output of grex -f grex.txt -r:

^<iframe src="/{2}player\.(?:bili){2}\.com/player\.html\?aid=(?:(?:45{2}751094&bvid=BV1U541{2}s7RU&cid=383073940|(?:26123{2}537&bvid=BV1xe41{2}j7EQ&cid=851{2}7146|(?:30){2}652{2}6&bvid=BV1dP41{2}n7bc&cid=83{2}485{3})1)|5{2}85287{2}2&bvid=BV1Ee4y1r7wX&cid=848{2}23074)&page=1" scrol{2}ing="no" border="0" frameborder="no" framespacing="0" al{2}owful{2}scre{2}n="true"> </iframe>

Expected output:

<iframe src="//player\.bilibili\.com/player\.html\?aid=([0-9]+)&bvid=([0-9a-zA-Z]+)&cid=([0-9]+)&page=1" scrolling="no" border="0" frameborder="no" framespacing="0" allowfullscreen="true"> </iframe>
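
To make the request concrete, here is a minimal sketch of one way such a generalization could work. It is not how grex works internally and not a proposal for its API; the functions `escape`, `tokenize`, and `generalize` are hypothetical names invented for this illustration. The idea is to split every test case into alphanumeric and non-alphanumeric runs, keep the runs that are identical in all test cases as literals, and replace each run that differs with a capturing character class. The samples are shortened to the relevant part of the URL for readability, and the sketch assumes all test cases split into the same number of runs.

```python
import re

def escape(text):
    """Backslash-escape a small set of regex metacharacters."""
    return re.sub(r"([.^$*+?()\[\]{}|\\])", r"\\\1", text)

def tokenize(text):
    """Split into alternating alphanumeric and non-alphanumeric runs."""
    return re.findall(r"[0-9a-zA-Z]+|[^0-9a-zA-Z]+", text)

def generalize(samples):
    """Keep shared runs literal; turn differing runs into capture groups."""
    rows = [tokenize(s) for s in samples]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("samples do not share the same token structure")
    parts = []
    for column in zip(*rows):
        variants = set(column)
        if len(variants) == 1:
            # Identical in every sample: emit as an escaped literal.
            parts.append(escape(column[0]))
        elif all(re.fullmatch(r"[0-9]+", v) for v in variants):
            parts.append("([0-9]+)")
        elif all(re.fullmatch(r"[0-9a-zA-Z]+", v) for v in variants):
            parts.append("([0-9a-zA-Z]+)")
        else:
            # Fallback: plain alternation of the escaped variants.
            parts.append("(" + "|".join(sorted(escape(v) for v in variants)) + ")")
    return "".join(parts)

# Shortened versions of the four test cases from this issue.
samples = [
    "player.html?aid=303065226&bvid=BV1dP411n7bc&cid=833485551&page=1",
    "player.html?aid=261233537&bvid=BV1xe411j7EQ&cid=851171461&page=1",
    "player.html?aid=558528772&bvid=BV1Ee4y1r7wX&cid=848823074&page=1",
    "player.html?aid=455751094&bvid=BV1U5411s7RU&cid=383073940&page=1",
]

result = generalize(samples)
print(result)
```

The trade-off is that the resulting pattern also matches inputs that were never given as test cases, which differs from grex's default behavior of matching exactly the provided test cases.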

Related issue: https://github.com/pemistahl/grex/issues/48

pemistahl commented 1 year ago

Thank you for this feature request, @suliveevil. As you stated, this kind of feature has already been requested in #48. I think I will try to implement it for the upcoming version 1.5.0 because it does not look too difficult at first glance.

suliveevil commented 1 year ago

Looking forward to version 1.5.