ropensci / spelling

Tools for Spell Checking in R
https://docs.ropensci.org/spelling

URLs in DESCRIPTION #37

Closed maelle closed 5 years ago

maelle commented 5 years ago

cf #28 (not sure my regex is good enough though)

line <- "Package using hunspell typooo <https://docs.ropensci.org/hunspell>"
spelling:::spell_check_file_text(textConnection(line), dict = hunspell::dictionary())
#>    https hunspell ropensci   typooo 
#>      "1"      "1"      "1"      "1"
spelling:::spell_check_description_text(textConnection(line), dict = hunspell::dictionary())
#> hunspell   typooo 
#>      "1"      "1"

Created on 2019-05-27 by the reprex package (v0.2.1)

jeroen commented 5 years ago

Your regex is too aggressive, for example try this string:

This is a url <google.com>. Here is more text. Blabla. Use version >= 2.

codecov-io commented 5 years ago

Codecov Report

Merging #37 into master will decrease coverage by 1.84%. The diff coverage is 0%.

@@            Coverage Diff             @@
##           master      #37      +/-   ##
==========================================
- Coverage   45.36%   43.52%   -1.85%     
==========================================
  Files           7        7              
  Lines         313      363      +50     
==========================================
+ Hits          142      158      +16     
- Misses        171      205      +34

Impacted Files     Coverage Δ
R/spell-check.R    31.66% <0%> (+0.83%) :arrow_up:
R/check-files.R    63.15% <0%> (-2.6%) :arrow_down:

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update 04ba5cb...73c5ed6.

maelle commented 5 years ago

Good point, I made it less greedy.
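
To illustrate what "less greedy" can mean here, the sketch below compares a greedy and a lazy quantifier on the test string from the earlier comment. The thread does not show the actual patterns used in the PR, so `greedy` and `lazy` are assumed stand-ins, not the package's code.

```r
# Test string from the earlier comment in this thread.
x <- "This is a url <google.com>. Here is more text. Blabla. Use version >= 2."

# Assumed illustrative patterns; the real PR regex is not shown in the thread.
greedy <- "<.*>"   # runs from the first "<" to the LAST ">" on the line
lazy   <- "<.*?>"  # stops at the FIRST ">" after "<"

greedy_result <- gsub(greedy, "", x, perl = TRUE)
lazy_result   <- gsub(lazy, "", x, perl = TRUE)

print(greedy_result)
print(lazy_result)
```

With the greedy pattern, everything up to the `>` in `>= 2` is removed, so most of the sentence would never be spell checked. The lazy pattern removes only `<google.com>`.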

jeroen commented 5 years ago

There are still false positives when the text contains no URL but has a < and then, somewhere later, a > sign with a lot of text in between. The pattern needs to be more specific than that, for example <http\\S*?>, which only matches if the content starts with http and there are no spaces within the <...>.
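
A minimal sketch of this suggestion, assuming the URL is removed with `gsub()` before spell checking. `strip_angle_urls()` and the `no_url` sentence are hypothetical and only serve the illustration; neither is part of the spelling package.

```r
# Hypothetical helper: drop <http...> spans that contain no whitespace.
strip_angle_urls <- function(text) {
  gsub("<http\\S*?>", "", text, perl = TRUE)
}

# Made-up sentence with "<" and ">" but no URL.
no_url <- "Requires x < 10 and a lot of other text before y > 2."
# DESCRIPTION-style line from the reprex at the top of this thread.
with_url <- "Package using hunspell typooo <https://docs.ropensci.org/hunspell>"

# A generic lazy pattern still removes the text between "<" and ">".
lazy_no_url <- gsub("<.*?>", "", no_url, perl = TRUE)

# The http-specific pattern leaves the non-URL sentence untouched.
strict_no_url   <- strip_angle_urls(no_url)
strict_with_url <- strip_angle_urls(with_url)

print(lazy_no_url)
print(strict_no_url)
print(strict_with_url)
```

One consequence of this design is that a bare `<google.com>` would no longer be stripped, since it does not start with http.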

maelle commented 5 years ago

believe it or not, I had checked the correct one in my untitled script but copy-pasted the wrong part :woman_facepalming: