Open Jacq4nn opened 2 years ago
Please check off boxes as applicable, and elaborate in comments below. Your review is not limited to these topics, as described in the reviewer guide.
The package includes all the following forms of documentation:
URL, BugReports and Maintainer (which may be autogenerated via Authors@R).
Estimated hours spent reviewing:
Estimated hours spent reviewing: 1.5 hours
Running `avg_word_len("it's me")` results in an output of 3, which is unexpected. I am not sure what is causing this problem. I would have expected either about 1.67 if the three "words" are "it", "s" and "me", or 2.5 if the words are "its" and "me".
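To make the arithmetic above concrete, here is a minimal base R sketch of the two tokenizations I expected, plus a third that happens to give 3. The helper `avg_len` and the three splitting rules are my own illustration, not the package's code, and I have not checked which rule `avg_word_len` actually uses.

```r
# Hypothetical helper (not from the package): mean number of characters per token.
avg_len <- function(words) mean(nchar(words))

text <- "it's me"

# (a) Split on every non-letter: "it", "s", "me" -> (2 + 1 + 2) / 3
tokens_a <- strsplit(text, "[^A-Za-z]+")[[1]]
avg_a <- avg_len(tokens_a)

# (b) Drop apostrophes, then split on whitespace: "its", "me" -> (3 + 2) / 2
tokens_b <- strsplit(gsub("'", "", text), "[[:space:]]+")[[1]]
avg_b <- avg_len(tokens_b)

# (c) Split on whitespace only, so the apostrophe counts as a character:
#     "it's", "me" -> (4 + 2) / 2
tokens_c <- strsplit(text, "[[:space:]]+")[[1]]
avg_c <- avg_len(tokens_c)

print(c(a = avg_a, b = avg_b, c = avg_c))
```

Reading (c) reproduces the value 3, so counting the apostrophe as part of the word is one possible explanation, though that is only a guess on my side.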
Running `perc_cap_words("I")` returns 0, which is unexpected behaviour. This could be because the function uses `stringr::str_count(text, "\\b[A-Z]{2,}\\b")`, which only matches all-caps words of at least 2 letters, so single-letter words such as "I" and "A" are never counted. Thus, running `perc_cap_words("I AM A BOY")` returns 50 instead of 100.
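The effect of the `{2,}` quantifier can be reproduced with a small base R sketch. The functions `count_matches` and `perc_cap_words_sketch`, and the `min_len` parameter, are hypothetical names of mine and only approximate what the package does; the word-counting rule in particular is an assumption.

```r
# Count regex matches in a single string (base R, PCRE syntax).
count_matches <- function(text, pattern) {
  hits <- gregexpr(pattern, text, perl = TRUE)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

# Hypothetical variant: percentage of words written entirely in capitals.
# min_len is the minimum number of letters an all-caps word must have.
perc_cap_words_sketch <- function(text, min_len = 1) {
  cap_pattern <- sprintf("\\b[A-Z]{%d,}\\b", min_len)
  n_caps <- count_matches(text, cap_pattern)
  n_words <- count_matches(text, "\\b[A-Za-z]+\\b")  # assumed definition of a word
  if (n_words == 0) return(0)
  100 * n_caps / n_words
}

perc_cap_words_sketch("I AM A BOY", min_len = 2)  # 50: only "AM" and "BOY" match
perc_cap_words_sketch("I AM A BOY", min_len = 1)  # 100: "I" and "A" now match too
perc_cap_words_sketch("I")                        # 100
```

Lowering the minimum length to 1 is just one option; whether a lone "I" or "A" should count as a capitalized word is a design decision for the authors.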
It would be great to add automated testing, which I'm sure you will include soon!
It would be nice to add the `Contributing` and `License` sections to the README so that it is clear how you want other people to work on the package.
It would be nice to have a vignette which demonstrates use of the functions in a single file and maybe even host it online.
In the future, I think it would be nice to include functions that compute the percentage of words in all lower case, and maybe the median word length.
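As a rough idea of what those two suggested functions could look like, here is a base R sketch. The names `split_words`, `perc_lower_words_sketch` and `median_word_len_sketch` are made up for illustration, and the tokenization (split on anything that is not a letter) is an assumption rather than the package's convention.

```r
# Hypothetical tokenizer: split on any run of non-letters, drop empty tokens.
split_words <- function(text) {
  words <- strsplit(text, "[^A-Za-z]+")[[1]]
  words[nzchar(words)]
}

# Percentage of words written entirely in lower case.
perc_lower_words_sketch <- function(text) {
  words <- split_words(text)
  if (length(words) == 0) return(0)
  100 * mean(grepl("^[a-z]+$", words))
}

# Median number of letters per word.
median_word_len_sketch <- function(text) {
  words <- split_words(text)
  if (length(words) == 0) return(0)
  median(nchar(words))
}

perc_lower_words_sketch("the QUICK brown Fox")  # 50: "the" and "brown"
median_word_len_sketch("the QUICK brown Fox")   # 4: lengths are 3, 5, 5, 3
```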
Estimated hours spent reviewing:
Well done team! I am sure the project is in progress and you are improving it as I write this review. Here are my comments:
The `else` here is not necessary, since you're returning from the function when the previous `if` condition is true: https://github.com/UBC-MDS/textfeatureinfor/blob/b863b56bab2c00b82c8a05bd8545096d9958b5bd/R/textfeatureinfor.R#L72
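To illustrate the point in general terms, the two functions below behave identically. Both are hypothetical examples of mine and not the code at the linked line.

```r
# Version with an else branch after a return.
mean_len_with_else <- function(words) {
  if (length(words) == 0) {
    return(0)
  } else {
    return(mean(nchar(words)))
  }
}

# Equivalent version: the early return already ends the function,
# so the remaining code does not need to sit inside an else.
mean_len_early_return <- function(words) {
  if (length(words) == 0) {
    return(0)
  }
  mean(nchar(words))
}

mean_len_with_else(c("ab", "cdef"))     # 3
mean_len_early_return(c("ab", "cdef"))  # 3
```

The second form keeps the main logic at one level of indentation, which is mostly a readability preference.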
Estimated hours spent reviewing: 30 minutes
Overall well done, this package definitely seems useful for NLP tasks and could assist with feature engineering. I understand that the package is in progress so you may already be working on some of the comments below.
- `install.packages("devtools")`: in the installation section of the README, you could say that the user needs to have this package installed to install your package this way.
- Consider adding `contributing` and `license` sections to your README.
Estimated hours spent reviewing: 1 hour
Overall well done team! R does not seem to have a library for extracting punctuation, so building the R package might be a little harder. Here are my comments:
- In the `remove_stop_words` function, you are using "stopwords-iso" as the source of stop words. It would be good to mention this in the documentation.
- `count_punc` returns an error when dealing with "\". For example, `count_punc("\")` returns an error and `count_punc("\\")` returns zero.
- `avg_word_len("it's me.")` now correctly returns 1.66667. Good job!
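A note on the backslash case: in R source code, `"\"` is an unterminated string literal, because the backslash escapes the closing quote, so the error there arises when R parses the call and before `count_punc` runs. A single backslash is written `"\\"`. The sketch below uses a stand-in function of my own, `count_punct_sketch`, built on the POSIX class `[[:punct:]]`; it is not the package's implementation.

```r
# "\\" in R source is a one-character string holding a single backslash.
backslash <- "\\"
nchar(backslash)  # 1

# Hypothetical stand-in for count_punc: count POSIX punctuation characters.
count_punct_sketch <- function(text) {
  hits <- gregexpr("[[:punct:]]", text)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

count_punct_sketch(backslash)     # 1: the backslash is in [[:punct:]]
count_punct_sketch("Hi, there!")  # 2
count_punct_sketch("no marks")    # 0
```

Under this definition a lone backslash counts as one punctuation character, so a result of zero may be worth a second look, depending on which characters the authors intend to treat as punctuation.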
name: textfeatureinfor
about: This R package extracts information from text features, which can be useful for feature engineering or in other data science projects
Submitting Authors:
Repository: textfeatureinfor
Version submitted: 0.0.0.9
Submission type: Standard
Editor: RB
Reviewers:
Archive: TBD
Version accepted: TBD
Scope
Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
Explain how and why the package falls under these categories (briefly, 1-2 sentences):
(If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
Explain reasons for any `pkgcheck` items which your package is unable to pass.
Technical checks
Confirm each of the following by checking the box.
This package:
Publication options
[ ] Do you intend for this package to go on CRAN?
[ ] Do you intend for this package to go on Bioconductor?
[ ] Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
- [ ] The package is novel and will be of interest to the broad readership of the journal.
- [ ] The manuscript describing the package is no longer than 3000 words.
- [ ] You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see [MEE's Policy on Publishing Code](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/journal-resources/policy-on-publishing-code.html))
- (*Scope: Do consider MEE's [Aims and Scope](http://besjournals.onlinelibrary.wiley.com/hub/journal/10.1111/(ISSN)2041-210X/aims-and-scope/read-full-aims-and-scope.html) for your manuscript. We make no guarantee that your manuscript will be within MEE scope.*)
- (*Although not required, we strongly recommend having a full manuscript prepared when you submit here.*)
- (*Please do not submit your package separately to Methods in Ecology and Evolution*)
Code of conduct