cran-task-views / NaturalLanguageProcessing

CRAN Task View: Natural Language Processing
https://CRAN.R-project.org/view=NaturalLanguageProcessing

General task view improvements #1

Closed zeileis closed 2 years ago

zeileis commented 2 years ago

Format: We have now converted the old XML-based task view file to the new Markdown-based format. Please check whether everything in the resulting .md file is in order or whether some corrections/improvements are needed. See the corresponding documentation for more details.

Update: This would also be a good time to check whether the entire task view is up to date and correct. The general goals and content guidelines are now summarized in the workflow for proposing new task views (which was not yet available when you authored your task view). Improving your task view to conform more closely with these goals and guidelines, if necessary, would be highly appreciated.

Co-maintainers: In particular, we strongly encourage all task views to have not only a principal maintainer but also 1-5 co-maintainers, in order to get more inspiration for extensions/improvements and to share the maintenance workload. So please try to extend the list of co-maintainers, ideally also including female contributors and generally involving persons with diverse backgrounds.

davidjohannesmeyer commented 2 years ago

Hi Frido @fwild,

Did you have time to look at the migrated task view, in particular the NaturalLanguageProcessing.md file?

fwild commented 2 years ago

Not yet - April 5 deadlines for grant proposals being in the way! But soon!

zeileis commented 2 years ago

Thanks for the follow-up, Fridolin. Please take at least 15 minutes to run

install.packages("ctv", repos = "https://R-Forge.R-project.org")
ctv::ctv2html("NaturalLanguageProcessing.md", cran = TRUE)
browseURL("NaturalLanguageProcessing.html")

and go through the resulting page to check whether all elements have been converted correctly and the task view is ok for publication on CRAN.

The more thorough update and the co-maintainers can wait until April.

fwild commented 2 years ago

Hi Achim,

checked it - but I got an R-Forge error:

install.packages("ctv", repos = "https://R-Forge.R-project.org/")
Warning in install.packages :
  unable to access index for repository https://R-Forge.R-project.org/bin/macosx/contrib/4.0:
  cannot open URL 'https://R-Forge.R-project.org/bin/macosx/contrib/4.0/PACKAGES'

Guess I'd have to wait for that to become available. I managed to install it from a standard source, though - of course with only the econometrics task view included.

I then finally managed to get around that by downloading the file directly and running ctv2html and browseURL on it, and it looks all good - only the related links and other resources look broken:


Related links

 c("

 The KMi", "cRunch tutorials
 ")
 c("

 A Gentle Introduction to", "Statistics for (Computational) Linguists (SIGIL)
 ")
 c("

 Stefan Th.", "Gries (2009): Quantitative Corpus Linguistics with R, Routledge.
 ")
 c("

 ttda:", "Tools for Textual Data Analysis (Deprecated)
 ")
 c("

 Corpora and NLP model packages at", "http://datacube.wu.ac.at/
 ")

Other resources

 c("

 GitHub Project: golgotha
 ")
 c("

 GitHub Project: regex-performance
 ")
 c("

 GitHub Project: sentencepiece
 ")
 c("

 Omegahat Package: Rstem
 ")

I guess this is something in the parser that is broken here? The latter section is generated from the package list - listing all those packages that are not on CRAN (one omegahat one and three GitHub packages). It makes sense to me that these would be listed similar to the package list above? And only the related links section stays as such?

The links look correct to me in their markdown syntax:

What do I need to do to fix these?

Best Fridolin


-- Performance Augmentation Lab School of Engineering, Math, and Computing Oxford Brookes University, Oxford OX33 1HX fon +44-(0)1865-484584 http://pal.cct.brookes.ac.uk

zeileis commented 2 years ago

Regarding the installation: You have configured your R options to look only for binary OS X packages, which are not provided by R-Forge. But you can always install the source version. So if you get a warning like this, always try adding type = "source".
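A minimal sketch of that suggestion, reusing the R-Forge repository URL from earlier in this thread and only adding the `type` argument:

```r
# R-Forge does not provide macOS binaries, so request the source package
# explicitly instead of relying on the default package type.
install.packages("ctv",
                 repos = "https://R-Forge.R-project.org",
                 type  = "source")

# Shows which package type this R session looks for by default.
getOption("pkgType")
```

ctv should normally install from source without extra build tools, on the assumption that it contains no compiled code.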

zeileis commented 2 years ago

As for the problems you list with the links, I cannot replicate these issues on my end. Not sure whether this is an issue with the pandoc version or something else specific to OS X. In any case, the Markdown for the links looks perfectly fine, so nothing needs to be changed in the markup on your end.

I noticed, though, that three of the URLs in the links do not work anymore, namely those for KMi cRunch, Gries 2009, and ttda. Could you please update these?

zeileis commented 2 years ago

OK, found and fixed the problem. Forgot to set options = "--wrap=none" when processing links with pandoc. Apparently the defaults changed between versions. Thanks for the pointer and sorry for the confusion.
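For illustration only, the effect of that pandoc option can be sketched with `rmarkdown::pandoc_convert()`. This is not the ctv package's internal code, and the link target below is a hypothetical placeholder. With pandoc's default wrapping, a long link can be broken across several output lines, which would match the split `c("...", "...")` entries reported above; `--wrap=none` keeps each link on one line.

```r
library(rmarkdown)  # assumes pandoc is available to rmarkdown

# One long Markdown link; the URL is a placeholder, not the real target.
md <- tempfile(fileext = ".md")
writeLines(
  "- [A Gentle Introduction to Statistics for (Computational) Linguists (SIGIL)](https://example.org/sigil)",
  md
)

# Convert to an HTML fragment with line wrapping disabled.
out <- tempfile(fileext = ".html")
pandoc_convert(md, from = "markdown", to = "html",
               output = out, options = "--wrap=none")

# Each line of the output file becomes one element of the character vector.
readLines(out)
```

Dropping the `options` argument and rerunning the conversion is a quick way to compare the two wrapping behaviours on a given pandoc version.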

zeileis commented 2 years ago

@fwild could you please have another quick look at the following two issues?

What I could update: Instead of the Gries (2009) reference, I now refer to the 2nd edition (2017).

fwild commented 2 years ago

All suggestions done now, checked visually and tested. Ready to go!

zeileis commented 2 years ago

Excellent, Fridolin, very much appreciated! The project is "Public" now and listed on the main ctv page.

If you can think of potential co-maintainers in the next weeks/months, that would still be a good idea, I think.