ArchitecturalKnowledgeAnalysis / EmailDatasetBrowser

Application for interacting with datasets produced by the EmailIndexer.
MIT License

How does downloading emails in the GUI work? #11

Closed wmeijer221 closed 2 years ago

wmeijer221 commented 2 years ago

I tried downloading emails from the Apache mailing archive; however, I either get an error, the download seemingly does nothing, or it downloads nothing. Hence my question. Additionally, when it does do something, it only handles 10-ish months, which is not what I put in the settings at all.

Thanks!

andrewlalis commented 2 years ago

image

For example, in this screenshot, I'd like to download the last 10 years of emails from the Apache Accumulo project. So then, I would download the emails like this:

image

What I think you might be seeing with the "10-ish months" is that the fetcher quits after 10 consecutive months in which it finds no emails. You can see exactly what's going on here. The reason for this is that I've previously been IP-banned by Apache for scraping their Jira issues too quickly, so I'm taking a conservative approach: a 1-second interval between downloads, and quitting once it appears that there's no point in continuing.
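To illustrate the stopping rule, here is a minimal Python sketch of that kind of loop. It is not the fetcher's actual code: the names `fetch_emails`, `fetch_month` and `months_backwards` are hypothetical, and walking backwards from the newest month is an assumption made only for this example. The two details taken from the description above are the 1-second pause between downloads and giving up after 10 consecutive empty months.

```python
import time
from typing import Callable, Iterator, List, Tuple


def months_backwards(year: int, month: int, count: int) -> Iterator[Tuple[int, int]]:
    """Yield `count` (year, month) pairs, newest first (assumed direction)."""
    for _ in range(count):
        yield year, month
        month -= 1
        if month == 0:
            year, month = year - 1, 12


def fetch_emails(
    fetch_month: Callable[[int, int], List[str]],  # hypothetical per-month downloader
    start_year: int,
    start_month: int,
    max_months: int,
    max_empty: int = 10,          # quit after this many consecutive empty months
    delay_seconds: float = 1.0,   # conservative pause between downloads
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    downloaded: List[str] = []
    empty_streak = 0
    for i, (year, month) in enumerate(
        months_backwards(start_year, start_month, max_months)
    ):
        if i > 0:
            sleep(delay_seconds)  # throttle to avoid hammering the archive
        emails = fetch_month(year, month)
        if emails:
            downloaded.extend(emails)
            empty_streak = 0      # any non-empty month resets the counter
        else:
            empty_streak += 1
            if empty_streak >= max_empty:
                break             # looks like there is nothing more to find
    return downloaded
```

With a rule like this, asking for 120 months (10 years) still stops early if the requested range begins with, or runs into, 10 empty months in a row, which would look like only "10-ish months" being handled.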

However, if it's still not clear what's going on, you can send me the exact info you filled into the download popup and I can debug the scenario.

wmeijer221 commented 2 years ago

Right! I was doing something completely different xD Thanks!