The Clop parser currently tracks every new post about part1/2/3 of a leak, so it is not really tracking when a new ransomware attack occurs. I think it makes more sense to take the list of companies from the top of the page rather than the posts announcing published parts.
For example, the last entry from the current parser was added on 2023-03-16, but according to my other monitors this attack was already listed two days earlier, on 2023-03-14.
This issue is open to discussion, but monitoring new attacks instead of published files makes more sense to me (for all the other groups, only newly added victims are monitored, not newly added files).
Proposed command for the parser:
grep 'g-menu-item-title' source/clop-*.html --no-filename | sed -e s/'<span class="g-menu-item-title">'// -e s/"<\/span>"// -e 's/^ *//g' -e 's/[[:space:]]*$//' -e 's/^ARCHIVE[[:digit:]]$//' -e s/'^HOW TO DOWNLOAD?$'// -e 's/^ARCHIVE$//' -e 's/^HOME$//' -e '/^$/d'
There was already a similar issue, #18, and the parser above would fix /stats too.
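For anyone who wants to check the filtering logic outside the shell, here is a rough Python sketch of what the grep/sed pipeline does. It is only an illustration, not the project's actual parser: the function name `extract_victims` and the two pattern names are made up for this example. One difference to be aware of: it keeps only the text inside each `g-menu-item-title` span, whereas the sed version keeps whatever else is left on the matching line.

```python
import re
from typing import List

# Text inside each <span class="g-menu-item-title">...</span> on a single line.
TITLE = re.compile(r'<span class="g-menu-item-title">(.*?)</span>')

# Navigation labels that the sed expressions blank out:
# HOME, ARCHIVE, ARCHIVE followed by one digit, and HOW TO DOWNLOAD?
NAV_LABEL = re.compile(r'HOME|ARCHIVE[0-9]?|HOW TO DOWNLOAD\?')


def extract_victims(html: str) -> List[str]:
    """Return menu entries that are not navigation labels (hypothetical helper)."""
    victims = []
    for raw in TITLE.findall(html):
        name = raw.strip()  # same effect as the two whitespace-trimming sed steps
        if name and not NAV_LABEL.fullmatch(name):
            victims.append(name)
    return victims
```

Calling `extract_victims(open(path).read())` on a saved `source/clop-*.html` file should give roughly the same list as the one-liner, in page order.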
I've removed the previous part records for clop from posts.json.
This removed 1136 otherwise-duplicate entries, which should be reflected at the next run.
Thanks again.