Closed vsoch closed 4 years ago
Merging #43 into master will increase coverage by
0.05%
. The diff coverage is84.61%
.
@@ Coverage Diff @@
## master #43 +/- ##
==========================================
+ Coverage 77.67% 77.72% +0.05%
==========================================
Files 20 20
Lines 645 651 +6
==========================================
+ Hits 501 506 +5
- Misses 144 145 +1
Impacted Files | Coverage Δ | |
---|---|---|
urlchecker/core/check.py | 83.33% <0.00%> (ø) |
|
urlchecker/core/fileproc.py | 91.30% <85.71%> (+1.06%) |
:arrow_up: |
tests/test_core_urlproc.py | 100.00% <100.00%> (ø) |
|
urlchecker/core/urlmarker.py | 100.00% <100.00%> (ø) |
|
urlchecker/version.py | 100.00% <100.00%> (ø) |
|
urlchecker/core/urlproc.py | 92.47% <0.00%> (-1.08%) |
:arrow_down: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update af9d7ac...7cb765a. Read the comment docs.
This seems like an okay fix for now but I will need to optimize this as it seems redundant. I couldn't test the regex online so can you provide a link for your test-project so I can check it manually.
Sure! We can definitely "optimize the string" if you think this is important - I think it's actually okay, since it's all clearly printed there for the developer to see, and we can be sure it's exactly what we had before. The test file I used was a jobs.yml file, here:
- {expires: 2020-06-10, location: 'Pasadena, CA', name: Research Software Engineer
at Caltech, url: 'https://www.linkedin.com/jobs/view/1689896027/'}
- {expires: 2020-06-10, location: 'Mountain View, CA', name: Research Software Engineer
at Google, url: 'https://www.linkedin.com/jobs/view/1817277202/'}
- {expires: 2020-06-10, location: 'Atlanta, GA', name: 'Architect, PACE team at Georgia
Institute of Technology', url: 'https://pace.gatech.edu/xsedeosg-architect'}
- {expires: 2020-06-10, location: 'Atlanta, GA', name: 'Scheduler Architect, PACE
team at Georgia Institute of Technology', url: 'https://pace.gatech.edu/scheduler-architect'}
- {expires: 2020-06-10, location: 'Socorro, NM', name: Staff Scientist/Software Engineer
at New Mexico Tech, url: 'https://www.nmt.edu/hr/docs/hr/jobs/StaffSciSoftEngIRIS20-030.pdf'}
- {expires: 2020-04-10, location: 'Boston, MA', name: Senior Research Computing Systems
Engineer at Boston University, url: 'http://www.bu.edu/tech/support/research/rcs/jobs/'}
- {expires: 2020-03-10, location: Caltech, name: 'Software Engineer/Research Software
Engineer: CliMA project', url: 'https://clima.caltech.edu/join-our-team/'}
- {expires: 2020-02-15, location: NumFOCUS, name: Research Software Engineering Fellow,
url: 'https://numfocus.org/blog/now-hiring-matplotlib-research-software-engineering-fellow'}
- {expires: 2020-02-01, location: Pittsburgh Supercomputing Center, name: Research
Software Specialist, url: 'https://www.psc.edu/about-psc/employment/3116-research-software-specialist-2014428'}
- {expires: 2020-03-01, location: University of Alabama, name: Research Software Engineer,
url: 'http://carver.cs.ua.edu/RSE.htm'}
- {expires: 2020-02-01, location: University of Illinois at Urbana-Champaign, name: Visiting
Research Scientist, url: 'https://jobs.illinois.edu/academic-job-board/job-details?jobID=123477&job=visiting-research-scientist-school-of-information-sciences-123477'}
- {expires: 2019-10-01, location: 'Boston, MA', name: Senior Scientific Programmer
at Boston University, url: 'http://www.bu.edu/tech/support/research/rcs/jobs/'}
- {expires: 2020-06-30, location: 'Cambridge, MA', name: Senior Research Software
Engineer (Data Engineer), url: 'http://bit.ly/fasrc_senior_rse'}
- {expires: 2019-09-01, location: University Park Campus, name: Software Consultants,
url: 'https://psu.jobs/job/88890'}
- {expires: 2019-10-01, location: University of Arizona, name: Research Programmer,
url: 'https://uacareers.com/postings/40792'}
- {expires: 2019-10-01, location: Indiana University Bloomington, name: Research Software
Engineer / Application Support Engineer, url: 'https://brainlife.io/docs/careers/jobs/'}
- {expires: 2019-10-01, location: Indiana University Bloomington, name: Research Software
Engineer / UI Engineer, url: 'https://brainlife.io/docs/careers/jobs/'}
- {expires: 2019-10-01, location: Indiana University Bloomington, name: Community
Developer / Outreach Coordinator, url: 'https://brainlife.io/docs/careers/jobs/'}
- {expires: 2019-10-01, location: Indiana University Bloomington, name: DevOps / Software
Engineers, url: 'https://brainlife.io/docs/careers/jobs/'}
- {expires: 2019-10-01, location: Indiana University Bloomington, name: Cloud and
Cluster Administrator, url: 'https://brainlife.io/docs/careers/jobs/'}
- {expires: 2019-11-01, location: Brown University, name: Research Software Engineer/Senior
Research Software Engineer, url: 'https://brown.wd5.myworkdayjobs.com/en-US/staff-careers-brown/job/180-George-Street/Research-Software-Engineer-Senior-Research-Software-Engineer_REQ162388'}
- {expires: 2020-01-15, location: 'Princess Margaret Cancer Centre, Toronto, ON, Canada',
name: 'Technical Software Developer, Data and Distributed Computing, CanDIG', url: 'https://www.recruitingsite.com/csbsites/uhncareers/JobDescription.asp?SiteID=10031&JobNumber=852516'}
Then ran like
urlchecker check --files jobs.yml --file-types .yml .
And something like this should be added to the test repo to ensure it doesn't trigger again (you said you were working on this, or not anymore?)
Sorry it took me some time to test this, I thought I can find a nice hack to fix this by editing urls = [url for url in urls if not re.search("(\\{[a-z0-9.]*})", url)]
but it seems that I will need some time to test that better. For now since this is a bug, I think you wrote a nice fix and as you said -we know that this works- so let's use this for now.
Concerning the testing, I am still on it. Hopefully this weekend, I will do it right this time. Unfortunately I cannot do much during the week days, since I have a lot to do at work atm and when I come home, I am basically done with coding.
Thank you for the speedy review! No worries on the tests, I'll link the comment here to the issue so you can easily find the file and copy paste.
I'm going to draft a new release for urlchecker-action now with the fix.
This will fix #42 by way of adding back the original URL regex, to be run across urls individually after they are cleaned for }. The bug is that this cleaning was removed but not added back after the custom parsing. The
FINAL_REGEX
could be made to include a combination of the domain_extensions, but this is the quickest / easiest solution for now (representing the entire string for the regex).Signed-off-by: vsoch vsochat@stanford.edu