PaulMcInnis / JobFunnel

Scrape job websites into a single spreadsheet with no duplicates.
MIT License
1.81k stars 212 forks source link

Date filter #64

Closed thebigG closed 4 years ago

thebigG commented 4 years ago

Pull Request Template

Description

The user can filter out jobs that they think are "too old". This is measured in days by setting the threshold_days in the settings.yaml file. I found myself finding many jobs that are too old, like more than a month old. Many times I came across jobs that had been taken down because of how old they were. Libraries/modules affected:

List any developers that will be affected or those who you had merge conflicts with.

Context of change

Please add options that are relevant and mark any boxes that apply.

Type of change

Please mark any boxes that apply.

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Provide instructions so we can reproduce. Please also list any relevant details for your test configuration. In order to activate the filter, the thershold_days variable in settings.yaml must be greater than or equal to 0.

Checklist:

Please mark any boxes that have been completed.

thebigG commented 4 years ago

Nice catch @markkvdb! If we just checked for positive integers in jobfunnel/config/validate.py, wouldn't that invalidate negative numbers for threshold_days? I was thinking of using -1 to give the user an option to completely turn off date_filter if they wanted to. Or do you think allowing negative values could result in confusion from the user's perspective? On the one hand, "turning it off" with a negative value is very much the same as setting threshold_days to a very high number(like 100).

Another question: What are your thoughts on the default value for threshold_days? So far the consensus is 20.

markkvdb commented 4 years ago

Nice catch @markkvdb! If we just checked for positive integers in jobfunnel/config/validate.py, wouldn't that invalidate negative numbers for threshold_days? I was thinking of using -1 to give the user an option to completely turn off date_filter if they wanted to. Or do you think allowing negative values could result in confusion from the user's perspective? On the one hand, "turning it off" with a negative value is very much the same as setting threshold_days to a very high number(like 100).

I don't think using -1 as some magical number is a good idea. New users will probably get confused when seeing that they can set threshold_days to a negative number. Instead, similar to the proxy setting, we could make the threshold_days option optional (hehe). Not defining threshold_days in the config file should not give errors.

In short, threshold_days should in my opinion reflect the number of days or should be left out.

Another question: What are your thoughts on the default value for threshold_days? So far the consensus is 20.

I think 30 might even make more sense. Some jobs are hard to fill and seeing a job opening for 30 days is not uncommon or weird.

markkvdb commented 4 years ago

Now that I think of it, maybe threshold_days is not the clearest name possible. Instead, max_listing_age, max_listing_days or max_days_listing might be good alternatives. Personally, this contains more information about the function of the option than threshold_days.

thebigG commented 4 years ago

I don't think using -1 as some magical number is a good idea. New users will probably get confused when seeing that they can set threshold_days to a negative number. Instead, similar to the proxy setting, we could make the threshold_days option optional (hehe). Not defining threshold_days in the config file should not give errors.

You're right. I think having it as a command line option might be the best way to go to avoid confusion.

Now that I think of it, maybe threshold_days is not the clearest name possible. Instead, max_listing_age, max_listing_days or max_days_listing might be good alternatives. Personally, this contains more information about the function of the option than threshold_days.

Indeed. Will be renaming it to max_listing_days. Will be pushing these changes ASAP.

markkvdb commented 4 years ago

Nice job! One minor thing: you could update the version number of JobFunnel 😉