digininja / CeWL

CeWL is a Custom Word List Generator

Progress Indication #71

Closed by zedseven 3 years ago

zedseven commented 3 years ago

Pointing this tool at bigger sites/wikis can take quite a long time, and it can be difficult to be sure it's actually doing something until it finishes or I force-kill it.

It would be very useful to have some kind of progress indication, even something simple like displaying the current page being processed or a running count of words found. This would make it much easier to be sure it's working and to gauge how far along it is.

Perhaps this issue is more about it being slow with the way I'm running it, but pointing it at something like a Wikipedia article with the default depth took over a day, and I ended up killing it. (I later realized a depth of 2 was likely too high, but the point remains.)
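As a rough illustration of why a depth of 2 can run for so long on a heavily linked site, the sketch below computes an upper bound on the number of pages a spider could visit. It assumes every page links to the same number of new pages and that depth counts link hops from the start page. Both the function name `max_pages` and the figure of 300 links per page are hypothetical, chosen only for the arithmetic.

```python
def max_pages(links_per_page: int, depth: int) -> int:
    """Upper bound on pages visited when following links up to `depth` hops.

    Assumes a uniform number of links per page and no duplicate targets,
    so a real crawl that skips already-seen pages would visit fewer.
    """
    # One start page, then links_per_page ** level pages at each further level.
    return sum(links_per_page ** level for level in range(depth + 1))


if __name__ == "__main__":
    for depth in range(3):
        print(f"depth {depth}: up to {max_pages(300, depth)} pages")
```

Under these assumptions, going from depth 1 to depth 2 multiplies the worst-case page count by roughly the number of links per page, which is why a small change in depth can turn minutes into days.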

digininja commented 3 years ago

I can't do a progress bar as I've no idea how big the site is before I start the spider.

You can dump out all the page information already with the -v flag.

zedseven commented 3 years ago

I know; that's why all my suggestions were things that don't rely on knowing a total (count of words found, current page, etc.). My point was that it was difficult to be sure it was doing anything, as there was no visual indication.
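To make the idea concrete, here is a minimal Python sketch of progress reporting that needs no total: it only tracks pages processed so far, unique words found so far, and the current queue length. It is not CeWL's code (CeWL is written in Ruby), and every name in it (`ProgressReporter`, `page_done`, `crawl`, the in-memory `site` mapping standing in for real HTTP fetches) is hypothetical.

```python
import re
import sys
from collections import Counter, deque


class ProgressReporter:
    """Prints a status line every `every` pages; never needs the site's size."""

    def __init__(self, every=10, stream=sys.stderr):
        self.every = every
        self.stream = stream
        self.pages = 0
        self.words = Counter()

    def page_done(self, url, text, queued):
        self.pages += 1
        # Count words of three or more letters, case-insensitively.
        self.words.update(w.lower() for w in re.findall(r"[A-Za-z]{3,}", text))
        if self.pages % self.every == 0:
            print(
                f"[{self.pages} pages, {len(self.words)} unique words, "
                f"{queued} queued] {url}",
                file=self.stream,
            )


def crawl(site, start, max_depth, reporter):
    """Breadth-first walk over `site`, a dict of url -> (text, links).

    The dict is a stand-in for fetching and parsing real pages.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        text, links = site[url]
        if depth < max_depth:
            for link in links:
                if link in site and link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
        reporter.page_done(url, text, len(queue))
    return reporter.words
```

Writing the status lines to standard error keeps them separate from the word list on standard output, so redirecting the output to a file would still leave the progress visible in the terminal.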

I guess -v will work for now.