Closed thebigG closed 4 years ago
Merging #82 into master will increase coverage by
12.56%
. The diff coverage is80.00%
.
@@ Coverage Diff @@
## master #82 +/- ##
===========================================
+ Coverage 45.80% 58.37% +12.56%
===========================================
Files 13 13
Lines 1109 1146 +37
===========================================
+ Hits 508 669 +161
+ Misses 601 477 -124
Impacted Files | Coverage Δ | |
---|---|---|
jobfunnel/jobfunnel.py | 47.68% <50.00%> (+15.18%) |
:arrow_up: |
jobfunnel/indeed.py | 87.32% <94.11%> (+59.22%) |
:arrow_up: |
jobfunnel/glassdoor_dynamic.py | 16.52% <0.00%> (+16.52%) |
:arrow_up: |
jobfunnel/__main__.py | 35.89% <0.00%> (+35.89%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update df8b8a8...8684f9d. Read the comment docs.
looking good!
only nitpick is we should wrap lines at 100 chars, but I dont want that to hold this up.
Yes! Just changed editors(using geany now) recently 😅 Will be sure to enforce this in the future.
Hey @thebigG really appreciate your addition of testing to the code-base, definitely makes the project seem more professional. Although the pull has already been merged, I left a comment regarding a change to the Regex code in indeed.py which could have a performance impact.
Although the pull has already been merged, I left a comment regarding a change to the Regex code in indeed.py which could have a performance impact.
I saw. Thanks a lot for pointing this out! I don't have time right now to look at these. I will have a new schedule/routine so need some time to re-adjust and find a new time slot for JobFunnel.
Indeed Tests
Hello everyone, Hope you are all doing well!
Description
I have added unit tests to
jobfunnel/indeed.py
. These tests were challenging because my goal was to isolate every piece of functionality of the indeed scraper so that if something fails in the future, we can easily find the point of failure. The unit tests were even more challenging because I aimed to make these tests as reliable as one can make them to be, given that they do depend on the network. I did not implement any flaky tests because flaky tests are not as reliable, given that they are flaky tests. Rather the approach I took was to, in some cases, have loose assertions. What I mean by this is that in cases where there was something that was not crucial to the job listing, such as the blurb, the only thing I ensure is that some(at least 1) job listing is returning a blurb of some kind. The data that I considered essential was the job id and the job link, since without these two, we don't have a job to apply to. The job id testtest_get_id
is the most strict test because without a valid job id we do not have a job at all. By most strict I mean if any one job posting fails to yield an id, the entire pytest suite fails. Note that I don't test tags at all because they are added "bonus" data we give to users, and many sites don't even yield any tags whatsoever. Hopefully this is clear in the tests onindeed.py
.Another challenge I came across was the fact that the way we were fetching data such as date, id, title, etc was not very well suited for testing in the sense that did not have these functionality in functions. Because of this I found myself copy pasting this code for all of the tests, and that's not a very efficient or maintainable way to write our tests. So I took the liberty of refactoring all of this functionality and writing wrapper functions for it and making them part of
jobfunnel.py
andindeed.py
. You can see the details of this refactoring change on this commit b33c18c329e55894a35313651586128cef87b89e. Hopefully we can implement this type of refactoring across all of the scrapers and make that code a little easier to maintain.readme updates and Docker
I have updated the readme and added some information about our new GlassDoor dynamic and static scheme. Hopefully that is clear enough for users.
I was wondering what your thoughts are on adding a Ubuntu Xenial Docker image pre-built for JobFunneel where developers can pull from and have a clean environment to do testing on? I do testing like this all the time to make sure my development environment does not affect JobFunnel testing and I think other people could benefit from having an image on Docker hub where they don't have to install all of the python dependencies manually and can just do
docker pull
or something like that.Hope these changes make sense, Cheers!
Context of change
Please add options that are relevant and mark any boxes that apply.
Type of change
Please mark any boxes that apply.
How Has This Been Tested?
Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration.
test_indeed.py
in isolationChecklist:
Please mark any boxes that have been completed.