pfmc-assessments / sa4ss

Generate a stock assessment document from Stock Synthesis output
https://pfmc-assessments.github.io/sa4ss/

spell check instructions #61

Open kellijohnson-NOAA opened 2 years ago

kellijohnson-NOAA commented 2 years ago

Problem

Spell checking across multiple .Rmd files is difficult, especially if they include code and LaTeX.

Proposal

💡 ?

:cucumber:

There is more discussion in Issue #70 of the 2021 lingcod repository.

chantelwetzel-noaa commented 2 years ago

I did my spell checking in a very inelegant way that was not described in the lingcod discussion. I converted the compiled PDF to a Word document, which allowed me to check spelling across text, tables, and figures. The annoying part, of course, was then fixing errors by hand in each of the Rmd files. Not great, but it worked for this assessment cycle.

shcaba commented 2 years ago

An option is to open each Rmd file in RStudio and use the spell check function under the "Edit" tab. It seems to overlook LaTeX commands, so it doesn't get bogged down. Works well in my opinion.

kellijohnson-NOAA commented 2 years ago

Thanks for the feedback on your workflow. I will try to document some options.

kellijohnson-NOAA commented 1 year ago

@k-doering-NOAA and @iantaylor-NOAA, do you have suggestions for how to do spell check on .Rmd files after working with the spell check and packages through nmfs-fish-tools/ghactions4r/issues/36?

iantaylor-NOAA commented 1 year ago

It appears that devtools::spell_check() is very specific to R packages and only checks the .Rd files and, optionally, the vignettes. However, it is apparently just a wrapper for {spelling} (https://docs.ropensci.org/spelling/), which is also recommended in the R Markdown Cookbook: https://bookdown.org/yihui/rmarkdown-cookbook/spell-check.html.

I have not used {spelling} directly, but am happy to help research this.
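A minimal sketch of calling the package-level check through {spelling} directly, assuming the working directory (or `pkg`) is the root of an R package. The wrapper name `check_package_spelling()` is hypothetical; `spelling::spell_check_package()` is the function that `devtools::spell_check()` calls.

```r
# Hypothetical wrapper around the package-level spell check.
# It stops early when `pkg` is not an R package, because
# spell_check_package() only covers .Rd files and (optionally) vignettes.
check_package_spelling <- function(pkg = ".") {
  if (!file.exists(file.path(pkg, "DESCRIPTION"))) {
    stop(
      "No DESCRIPTION file found in '", pkg, "'; ",
      "use spelling::spell_check_files() for stand-alone documents."
    )
  }
  spelling::spell_check_package(pkg, vignettes = TRUE)
}
```

For a directory of assessment .Rmd files that is not a package, `spelling::spell_check_files()` is the relevant entry point instead.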

Also, there's still value in checking the Rd files. I just ran devtools::spell_check() on the lingcod repo and it found some errors that I made, such as:

sensitivies         run_sensitivities.Rd:38
senstiviity         run_sensitivities.Rd:34
senstivity          sens_make_table.Rd:5

iantaylor-NOAA commented 1 year ago

Following the example for ?spelling::spell_check_files, I ran the following in the lingcod repository

files <- list.files("doc", pattern = "\\.(Rnw|Rmd|html)$", full.names = TRUE)
spelling::spell_check_files(files)

which found a bunch of non-standard words, none of which were actually misspellings. But I think that's because @kellijohnson-NOAA already did some form of spell check on the .tex files.

WORD                   FOUND IN
abc                    01executive.Rmd:461
                       16management.Rmd:5,34,40,57
                       30model.Rmd:260
acrlong                29data-notused.Rmd:1
                       surveycomp.Rmd:26,131
                       surveyindex.Rmd:1,55
addlinespace           52tables.Rmd:200
admb                   30model.Rmd:30,41
afsc                   surveyindex.Rmd:57
...

I think {sa4ss} could have a default WORDLIST that would include many of these terms, which individual assessment authors could then easily update via spelling::update_wordlist().
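A minimal sketch of how a word list could be applied outside of a package, assuming a plain-text file with one accepted word per line stored next to the documents. The file name `WORDLIST`, its location, and the helpers `read_wordlist()` and `check_rmd_spelling()` are hypothetical; the `ignore` argument of `spelling::spell_check_files()` is the only {spelling} feature relied on.

```r
# Hypothetical helper: read accepted words from a plain-text file,
# dropping blank lines, surrounding whitespace, and duplicates.
read_wordlist <- function(path) {
  if (!file.exists(path)) {
    return(character())
  }
  words <- trimws(readLines(path, warn = FALSE, encoding = "UTF-8"))
  unique(words[nzchar(words)])
}

# Hypothetical helper: spell check every document in `dir`,
# ignoring the words listed in `wordlist`.
check_rmd_spelling <- function(dir = "doc",
                               wordlist = file.path(dir, "WORDLIST")) {
  files <- list.files(dir, pattern = "\\.(Rnw|Rmd|html)$", full.names = TRUE)
  spelling::spell_check_files(files, ignore = read_wordlist(wordlist))
}
```

Keeping the accepted words in a file rather than in the script means terms such as ADMB or bycatch only need to be approved once per assessment.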

iantaylor-NOAA commented 1 year ago

Note that AFS has a list of individual words used in common and scientific names of fish (not full species names) for addition to spell checkers available from https://fisheries.org/books-journals/writing-tools/fishnames/.

iantaylor-NOAA commented 1 year ago

Bumping this issue in case it's useful for @okenk and @brianlangseth-NOAA.

Running

  files <- list.files("documents", pattern = "\\.(Rnw|Rmd|html)$", full.names = TRUE)
  spelling::spell_check_files(files)

worked for me. It produced a long list, 99% of which can be ignored, but it is easy enough to skim through to find the useful 1%, like samplers and varability.

Example of the top of the list (I think all these belong as they are):

WORD                 FOUND IN
ACL                  11introduction.Rmd:54   
ACLs                 11introduction.Rmd:54,56
acrlong              21s-wcgbts.Rmd:1        
ADMB                 34diagnostics.Rmd:4     
AFSC                 21s-wcgbts.Rmd:43,45    
ageing               21f-.Rmd:43,47
                     22biology.Rmd:62,64
Ageing               21f-.Rmd:47
                     22biology.Rmd:60,64
agg                  33results.Rmd:24,28
al                   01executive.Rmd:191,193,195
                     11introduction.Rmd:14,30,32,34,36
                     21f-.Rmd:20
                     22biology.Rmd:9
                     23enviro.Rmd:3,5,7
                     31summary.Rmd:24
                     32structure.Rmd:27
                     34diagnostics.Rmd:4
aL                   22biology.Rmd:37
Alverson             11introduction.Rmd:43
Amax                 22biology.Rmd:58
amphipods            11introduction.Rmd:26
arabic               10a.Rmd:1
Baja                 11introduction.Rmd:4
bathymetric          11introduction.Rmd:6
BDS                  21f-.Rmd:39
Bertalanffy          01executive.Rmd:57
                     22biology.Rmd:58
Beverton             01executive.Rmd:57,135
                     32structure.Rmd:27
biasadj              33results.Rmd:44
bratio               32structure.Rmd:7,25,27
bycatch              21f-.Rmd:26
caal                 33results.Rmd:30
CAAL                 21s-wcgbts.Rmd:39
                     33results.Rmd:30
CalCOM               21f-.Rmd:20,39,43
catchability         01executive.Rmd:267,269
                     21s-wcgbts.Rmd:6
                     34diagnostics.Rmd:15
                     40management.Rmd:27
CDFW                 21f-.Rmd:20,39
Chantel              41acknowledgments.Rmd:7,8
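A minimal sketch of one way to shorten a list like the one above before skimming it, assuming the flagged words are available as a character vector (for example, the `word` column of the object returned by `spelling::spell_check_files()`). The helper `likely_typos()` and its acronym rule are hypothetical choices and will not suit every document.

```r
# Hypothetical helper: drop all-caps acronyms (optionally with a plural "s")
# and words already known to be fine, compared case-insensitively.
likely_typos <- function(words, known = character()) {
  is_acronym <- grepl("^[A-Z]{2,}s?$", words)
  is_known <- tolower(words) %in% tolower(known)
  words[!is_acronym & !is_known]
}

# Example with words taken from the output above plus one real typo.
flagged <- c("ACL", "ACLs", "Bertalanffy", "bycatch", "varability")
likely_typos(flagged, known = c("bertalanffy", "bycatch"))
```

Mixed-case names such as CalCOM are not treated as acronyms by this rule, so they would need to be added to `known`.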

okenk commented 1 year ago

RStudio has a pretty good spell checker built in! (i.e., the typo gets a little red underline in the editor)