Closed kamapu closed 3 years ago
Thanks for your submission @kamapu, we're discussing and will get back to you soon.
:wave: again @kamapu, it's a bit of a tricky situation and we're going to need more information about your package before making a decision.
The general question is: how would an R user with a taxonomy problem at hand know how to choose one of the two packages?
Your answer could be a table as in this pre-submission inquiry or a decision tree. I know it's a bit of work that might not even warrant onboarding, but it'd be something useful to have in the docs of your package (and of `taxa` probably), since the existence of both packages, even if not under rOpenSci's umbrella, means users might have to make a decision.
Thanks for your patience!
:wave: @kamapu! Any update or answer on the above? Thank you!
@kamapu! Any update or answer on the above? Thank you!
Dear @maelle, there are not really any updates on this issue. At the moment I am in the field (Kenya) for a longer period, so it will be difficult for me to properly document the differences between `taxlist` and `taxa`. I can only say in favour of `taxlist` that this package is implemented in `vegtable`, which is meant as a container for vegetation-plot databases in R. In fact, `taxlist` development runs in parallel to `vegtable`, but they are kept in separate packages due to the complexity of taxonomic structures and the potential integration of taxonomic lists in other object classes.
Now my question is: Can I submit two packages in bulk to rOpenSci, or should I leave `taxlist` and try to submit `vegtable` alone?
Best regards!
One (pre-)submission per package is best.
Have a productive time in the field!
Dear @maelle:
I went through most of the functions in the manual of `taxa` to get a better impression of the differences between this package and `taxlist`. I have to admit that it has been hard for me to understand how `taxa` deals with taxonomic information, since its programming style is at a higher level than my primitive skills. Thus, the following comments should be interpreted as impressions rather than truth.
Basically both mentioned packages fulfil similar tasks but use different approaches: `taxa` uses a function-oriented approach (`R6`), while `taxlist` uses a data-oriented one (`S4`). Both packages attempt to provide an object class containing taxonomic information, but `taxlist` is strongly focused on potential applications in vegetation-plot databases. `taxlist` allows diverse degrees of completeness, from a plain listing of taxonomic entities up to data sets including taxonomic ranks, synonyms, ecological traits (or further taxon attributes), and taxon views.
Further aspects are probably not implemented in `taxa` and can be considered as the contribution of `taxlist` to rOpenSci:
- While `taxa` is importing information from several on-line databases, `taxlist` offers alternatives of import data from local data sets. For instance, data loaded in data frames can be converted by using the function `df2taxlist()`. There is also a function called `tv2taxlist()` for importing species lists included in Turboveg.
- The consistency of information included in taxonomic lists is "cross-checked" by validity in `S4` objects. In the case of `taxlist`, potential issues such as duplicated combinations, parent-child relationships that are not consistent with taxonomic ranks, etc. will generate error messages when creating new objects. It is not clear to me how those issues are handled in `taxa`.
- I don't see any mentions to synonymy in `taxa`, which is a crucial feature in `taxlist`, assuming that all databases including data from different sources have to implement a way of linking taxon usage names (either accepted names or synonyms) with taxon concepts. There is no mention of taxon views in `taxa`, either.
- `backup_object()` for inserting time stamps to backups produced by `save()`
- `dissect_name()` (split names into single terms) and `match_names()` for comparing character values with names included in `taxlist` objects
- `knitr` and `rmarkdown` for inserting taxon names in respective documents using the function `print_name()`
Additional features in development, which may not be yet included in `taxa`, are:
- `taxlist` enable single taxon views per taxon concept)
- A function exporting the content of `taxlist` objects into the Veg-X standard
I hope the previous comments properly support `taxlist` as a package complementary to `taxa` rather than redundant, and make it eligible for rOpenSci. Regardless of the final decision, a way to make data sets exchangeable between those two applications should be strongly recommended.
Thanks for looking through the docs/code @kamapu! Here are some thoughts:
While taxa is importing information from several on-line databases, taxlist offers alternatives of import data from local data sets.
`taxa` can read local datasets using `parse_tax_data`, but it does not have any parsers for specific local formats, since we wanted to keep everything generic.
The consistency of information included in taxonomic lists is "cross-checked" by validity in S4 objects.
Yep, `taxa` uses R6 classes, which are not strongly typed, so users can break the objects if they set fields manually instead of using our dplyr-like manipulation functions.
I don't see any mentions to synonymy in taxa, which is a crucial feature in taxlist
There is no such feature explicitly and we don't have plans for one at the moment.
A plotting function displaying relations between taxon as dendrograms
`taxa` uses `print_tree` for this.
A function exporting the content of taxlist objects into the Veg-X standard
I have no plans to support import or export of specific file formats in `taxa`, since it is primarily a data manipulation standard.
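For readers unfamiliar with the "validity in S4 objects" point discussed above, here is a minimal sketch of the mechanism. The class `ToyTaxa`, its slots, and its rules are hypothetical and only illustrate the idea; this is not the actual `taxlist` class definition.

```r
library(methods)

# Hypothetical toy class; NOT the real taxlist class definition.
setClass(
  "ToyTaxa",
  representation(id = "integer", name = "character", parent = "integer"),
  validity = function(object) {
    if (length(object@id) != length(object@name) ||
        length(object@id) != length(object@parent)) {
      return("slots 'id', 'name' and 'parent' must have the same length")
    }
    if (anyDuplicated(object@name) > 0) {
      return("duplicated names are not allowed")
    }
    # A parent must be an existing id, or NA for top-level taxa.
    known <- object@parent %in% c(object@id, NA_integer_)
    if (!all(known)) {
      return("every parent must refer to an existing id (or be NA)")
    }
    TRUE
  }
)

# A consistent object is created without complaint.
ok <- new("ToyTaxa", id = 1:2, name = c("Cyperus", "Cyperus papyrus"),
          parent = c(NA_integer_, 1L))

# An inconsistent object (duplicated name) is rejected at creation time.
bad <- tryCatch(
  new("ToyTaxa", id = 1:2, name = c("Cyperus", "Cyperus"),
      parent = c(NA_integer_, 1L)),
  error = function(e) conditionMessage(e)
)
```

Here `bad` holds the error message instead of an object, which is the kind of early failure described for `taxlist` above.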
thanks @kamapu for the reply - we're discussing now
@kamapu We've decided to proceed. We do feel it is important, though, to have functions to convert between the two packages' major data structures/classes, in both packages if possible. Your editor @maelle will follow up with more information.
@sckott This is great news! I will put some effort into producing the required functions, though I may require the support of @zachary-foster.
:wave: @kamapu! I think it'd be great to add the conversion functions before the onboarding process. I'll put this submission on hold, but in the meantime, feel free to ask any questions.
OK, I hope there is no hurry (I'm already busy with the closure of 2018). Don't be surprised if I use the gap between Christmas and New Year for it :anguished:
No problem, I might ping you once in a while but that's fine. Please do join the Slack as soon as you want (I sent you an invite).
:wave: @kamapu! Happy New Year! Any update?
@maelle Happy New Year! I thought the day would come: I had no time but was always thinking about it every day. Now I'll start the discussion with @zachary-foster
:wave: @kamapu, any update? Note that I'll be unavailable from sometime between the beginning and middle of June until mid-October, so your editor would be @noamross after that point.
Hello @maelle and @noamross
There is one recent step here
We are working on a function for converting `taxlist` objects into `Taxmap`.
@noamross
After a while, we managed to write a function exporting and importing `taxlist` objects to `taxmap`. That is to say, objects could move from taxlist to taxa and back (see here).
Thus, the submission can proceed.
Thank you @kamapu for the update! Before assigning reviewers, we still need the package to have a test suite, with test coverage reporting of at least 75%.
Dear @noamross Do you mean I should follow these instructions? I'm a self-made programmer and now I'm feeling like someone, who is driving without driving license
I'm a self-made programmer and now I'm feeling like someone, who is driving without driving license
As are most of us! Yes, that's where to start for writing tests.
:wave: @kamapu! Any progress on the tests? Any help needed?
Uff! Time is not on my side. I'll give it a try and come back to you.
@maelle I managed to implement the test suite. At the moment `taxlist` has the requested 75% of code coverage. I'll try to improve it, but I thought we could proceed with the submission process. I hope this was the hardest part (everything was new for me).
:wave: @kamapu! Great news! Congrats on adding the tests, I imagine it wasn't easy but you've learnt something very useful. :smile_cat:
A big ask I have before looking for reviewers would be a summary of https://github.com/ropensci/taxa/issues/130 in the form of a README section aimed at helping potential users choose between taxlist and taxa depending on their use case/coding style preferences. :wink: It'd be ideal if taxa then featured the same information cc @zachary-foster @sckott :-)
Is taxlist aimed at being mainly developer facing i.e. used as a dependency of other packages?
I'm getting an error when running the tests.
── 1. Error: (unknown) (@test-taxlist2taxmap.R#9) ──
attempt to apply non-function
1: taxlist2taxmap(data1) at testthat/test-taxlist2taxmap.R:9
2: taxlist2taxmap(data1)
══ testthat results ═════════════════════════
[ OK: 91 | SKIPPED: 0 | WARNINGS: 0 | FAILED: 1 ]
1. Error: (unknown) (@test-taxlist2taxmap.R#9)
`goodpractice` output
── GP taxlist ─────────────────────────────────────────────────────
It is good practice to
✖ not use "Depends" in DESCRIPTION, as it can
cause name clashes, and poor interaction with other
packages. Use "Imports" instead.
✖ avoid long code lines, it is bad for
readability. Also, many people prefer editor windows
that are about 80 characters wide. Try make your lines
shorter than 80 characters
R/accepted_name.R:54:1
R/add_synonym.R:16:1
R/backup_object.R:32:1
R/basionym.R:48:1
R/clean.R:43:1
... and 36 more lines
✖ avoid sapply(), it is not type safe. It might
return a vector, or a list, depending on the input
data. Consider using vapply() instead.
R/dissect_name.R:8:9
R/dissect_name.R:10:9
✖ avoid 1:length(...), 1:nrow(...),
1:ncol(...), 1:NROW(...) and 1:NCOL(...) expressions.
They are error prone and result 1:0 if the expression
on the right hand side is zero. Use seq_len() or
seq_along() instead.
R/add_concept.R:73:42
R/df2taxlist.R:76:36
R/df2taxlist.R:76:64
R/load_last.R:35:17
R/match_names.R:20:27
... and 7 more lines
✖ not import packages as a whole, as this can
cause name clashes between the imported packages.
Instead, import only the specific functions you need.
A few things to fix, do not hesitate to ask for help via this thread/discuss.ropensci.org!
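Two of the `goodpractice` hints above (prefer `vapply()` over `sapply()`, and `seq_along()` over `1:length(...)`) can be illustrated with a small base R sketch. The input vector is made up for the example and does not come from the package.

```r
# Hypothetical input; any character vector of taxon names would do.
taxa_names <- c("Cyperus papyrus", "Typha domingensis")

# sapply() may return a list or a vector depending on the input;
# vapply() fails unless each result matches the declared template.
n_terms <- vapply(strsplit(taxa_names, " ", fixed = TRUE), length, integer(1))

# 1:length(x) yields c(1, 0) for an empty vector, so a loop over it runs twice;
# seq_along(x) yields integer(0), so the loop is skipped.
empty <- character(0)
bad_iterations <- length(1:length(empty))
good_iterations <- length(seq_along(empty))
```

With an empty input, `bad_iterations` is 2 while `good_iterations` is 0, which is the failure mode the hint warns about.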
Reviewers: @mcsiple @levisc8 Due date: 2020-06-22
Any update, @kamapu?
Still thinking about the best way to prepare a proper answer.
Dear @maelle, you are right, I have learnt a lot and got some new skills for programming packages.
Although I still have to do the comparison with `taxa` in README, I respond herewith to the rest of your comments in advance.
- Is taxlist aimed at being mainly developer facing i.e. used as a dependency of other packages?
No and yes. The package `taxlist` is handling taxonomic information within objects defined by the package `vegtable`, the latter handling biodiversity records. Nevertheless, the users need to know the objects and functions defined in `taxlist` to exploit the capabilities of both packages.
- I'm getting an error when running the tests.
This is due to the function `taxlist2taxmap()`, which is depending on the current version of `taxa` at GitHub but is not working with the last version at CRAN. I requested @zachary-foster to update the CRAN version (see here). As soon as `taxa` gets updated, this error should disappear.
`goodpractice` output
All issues have been solved with the exception of the second one ("avoid long code lines, it is bad for readability"). I am also working with an editor 80 characters wide and usually break code for readability. The problem is caused by strings included in `stop()` and `warning()` statements, which I prefer not to break. Since those lines tend to be within functions and therefore indented, the width limit is exceeded. I hope this issue can be tolerated; otherwise I'll need to suppress the indentation in those specific lines.
Required changes have been committed to the master branch.
Pending tasks are therefore:
- Insert comparison with `taxa` in README
Thank you for your work and answer!
This is due to the function taxlist2taxmap(), which is depending on the current version of taxa at GitHub but is not working with the last version at CRAN.
In the meantime please use a Remotes field in DESCRIPTION cf https://remotes.r-lib.org/articles/dependencies.html#github
Regarding the long lines for warnings, add "# nolint" at the end cf https://github.com/jimhester/lintr#project-configuration
taxa is mostly aimed at developers, maybe a point for the comparison. cf https://github.com/ropensci/taxa/issues/201#issuecomment-543868257
Reg test coverage yes it is just at the limit at the moment. :wink:
:wave: @kamapu! Any update? Happy New Year. :slightly_smiling_face:
Happy New Year @maelle! I am answering late because I'm still working on it and was waiting to have good news for you:
1) Long code lines were solved by skipping some indentation space and using `paste()` for overly long messages.
2) Coverage has also been improved a bit, but I am still struggling with some issues in this regard (see questions at the end).
3) Since I have to submit a new version of `taxlist` to CRAN, I had to remove the functions `taxlist2taxmap()` and `taxmap2taxlist()` from master and will reinsert them once I get news about a new CRAN version of `taxa` that is compatible with those functions.
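The two ways of handling long message strings mentioned in this thread (building the message with `paste()`, or keeping one long line and marking it with `# nolint` for lintr) might look roughly like this. Both functions are hypothetical helpers written for illustration and are not part of `taxlist`.

```r
# Hypothetical helper, used only to illustrate the two options.
check_level <- function(level, valid_levels) {
  if (!level %in% valid_levels) {
    # Option 1: build the message from short pieces with paste().
    stop(paste(
      "Level", sQuote(level), "is not among the valid levels:",
      paste(valid_levels, collapse = ", ")
    ))
  }
  invisible(TRUE)
}

warn_once <- function() {
  # Option 2: keep the string on one line and silence the linter for it.
  warning("This is a deliberately long warning message kept on a single line for readability of the text.") # nolint
}
```

Option 1 keeps every source line short without giving up indentation; option 2 keeps the message searchable as a single string in the source.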
I will try to finish the README.md file and come back to you. In the meantime, some questions:
1) I don't have any clue about how to test functions that write files, specifically `backup_object()`, which writes a .rda file. I was not able to find help on the Internet (some advice was not comprehensible for my background).
2) The same is valid for functions printing in the console, for instance `taxlist::summary()`.
Some hints for me?
Thanks for the updates!
I am sorry, I've just noticed you do not use `roxygen2` for producing documentation, which is a requirement before review. I haven't done the conversion from Rd to roxygen2 myself but you might find the rd2roxygen package useful. When using roxygen2 you won't have to update NAMESPACE by hand. See e.g. the chapter about man/ from the R Packages book and the chapter about NAMESPACE. I am really sorry for not seeing that earlier, but that change will be for the best.
Reg your questions
expect_output()
I'd even recommend using Markdown formatting for roxygen2 so you might need to run roxygen2md after Rd2roxygen. :-) (that part is not a requirement, I just think it's easier to use Markdown formatting later, but you can disagree).
Thank you for your prompt answer. I just wanted to announce the latest version uploaded to master, to release myself from the work, but I see it is a never-ending story...
I may have overlooked the implementation of `roxygen` in the source. I have read about it but never given it a try, thus again, something new for me and more time to get there...
Regarding the test of writing functions, where will the file get written? In the active working directory, meaning the main folder of the package source? In a "virtual" (temporary) working directory?
You won't regret learning about roxygen2, for this package and future packages. :wink: It's fine if it takes time, I can imagine you're busy.
The reference file would be in the test/ folder, and the one created during test in a temporary directory. You can use the same code as in this helper file that'll create a temporary directory and delete it once the test is run. The `check=TRUE` argument of `tempdir()` was introduced in R 3.5.0.
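As a rough illustration of the advice above, the following base R sketch writes to a temporary file, reloads it, and captures console output as text. `save_backup()` is a hypothetical stand-in, not the real `backup_object()`; inside a testthat test, `expect_true()`, `expect_identical()` and `expect_output()` would take the place of the plain comparisons.

```r
# Hypothetical stand-in for a function that saves an object to an .rda file.
save_backup <- function(object, file) {
  save(object, file = file)
  invisible(file)
}

# Write into a temporary location instead of the package source folder.
tmp <- tempfile(fileext = ".rda")
save_backup(iris, tmp)
file_was_written <- file.exists(tmp)

# Reload into a fresh environment and compare with the original.
env <- new.env()
load(tmp, envir = env)
round_trip_ok <- identical(env$object, iris)

# Console output can be captured as text and then inspected.
printed <- capture.output(summary(iris$Species))

# Clean up so the test leaves nothing behind.
unlink(tmp)
```

Because the file lives under the session's temporary directory, nothing is written to the working directory or the package source.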
Dear @maelle I managed to "roxygenize" the package. You are right, it makes a lot of sense to implement it, though it has been also very helpful to have the experience doing documentation by hand. What is next? :wink:
Awesome, congrats! I too first learned to write docs by hand but am not sure I wouldn't have preferred to be shown roxygen2 first
A remaining point was "Insert comparison with taxa in README", have you made progress on that?
I tried, but using a different approach than the one suggested:
1) The first paragraph in the chapter Similar Packages summarizes the differences between `taxlist` and `taxa`, with a link to the detailed discussion.
2) I also inserted two chapters called Rmarkdown Integration and Descriptive Statistics that do not directly mention `taxa` but attempt to highlight special features of `taxlist`, which may not be considered in any other package dealing with taxonomic information.
Please let me know if this is OK or if some additions are required.
I also remind you that the functions written to transform `taxlist` objects into `taxmap` and vice versa were removed from the master branch, since they are not working with the current CRAN version of `taxa`.
I will reinsert those functions once `taxa` is accordingly updated.
Thank you. I'm not sure it's sufficient to answer the question "how would an R user with a taxonomy problem at hand know how to choose one of the two packages?", especially as it'd demand a potential user to read and digest information from a GitHub issue thread. We were especially hoping both packages would have the same summary/info in their README. @sckott @zachary-foster could you please comment on that + on when taxa will be updated on CRAN? Thanks all.
Furthermore, @kamapu, to help my understanding and my reviewer search when time comes, feel free to point me to use cases by other people. :-)
"how would an R user with a taxonomy problem at hand know how to choose one of the two packages?" ... We were especially hoping both packages would have the same summary/info in their README.
I am not really sure, to be honest. It's been a bit since I tried out `taxlist`. I might have to revisit it to answer that. @kamapu can correct me if I am wrong, but perhaps the main difference is that `taxa` is targeted towards developers who want to make an R package that uses taxonomic data, but do not want to make classes and manipulation functions from scratch. `taxa` is like `tibble` + `dplyr` for taxonomic data. Non-developer users would interact with the classes and functions defined by `taxa`, but would do so in the context of another package, much like tibbles are used in packages besides `tibble` and users of tibbles might not even know which package defines them.
on when taxa will be updated on CRAN?
I will try to get an update out this week
⚠️⚠️⚠️⚠️⚠️
In the interest of reducing load on reviewers and editors as we manage the COVID-19 crisis, rOpenSci is temporarily pausing new submissions for software peer review for 30 days (and possibly longer). Please check back here again after 17 April for updates.
In this period new submissions will not be handled, nor new reviewers assigned. Reviews and responses to reviews will be handled on a 'best effort' basis, but no follow-up reminders will be sent.
Other rOpenSci community activities continue. We express our continued great appreciation for the work of our authors and reviewers. Stay healthy and take care of one another.
The rOpenSci Editorial Board
⚠️⚠️⚠️⚠️⚠️
@maelle I managed to produce a new version of `taxlist`, which may consider all tasks requested for its submission to rOpenSci.
- `roxygen2`.
- `covr` and coverage increased to over 90%.
- `taxlist2taxmap()` and `taxmap2taxlist()` developed for the exchange of objects with `taxa`.
- `taxlist` vs. `taxa`.
On the latter case, I wrote an itemized list in README, chapter Similar Packages.
Note that since I am not really working much with `taxa`, I'm not able to provide a neutral comparison between the two packages. The same is valid for the developers of `taxa` in the other direction. Thus I rather provide examples of applications where the users may prefer or have to use `taxlist` instead of `taxa`.
Thanks @kamapu, it looks great, but I'll hold off looking for reviewers now as we extended the pause mentioned earlier until at least May the 7th. Thanks for your understanding!
@maelle Should I "freeze" the master branch or can I still make changes?
You can still make changes, just don't decrease the code coverage now that it's so good. We'll update threads when we are back to normal operation.
⚠️⚠️⚠️⚠️⚠️ In the interest of reducing load on reviewers and editors as we manage the COVID-19 crisis, rOpenSci new submissions for software peer review are paused.
In this period new submissions will not be handled, nor new reviewers assigned. Reviews and responses to reviews will be handled on a 'best effort' basis, but no follow-up reminders will be sent. Other rOpenSci community activities continue.
Please check back here again after 25 May when we will be announcing plans to slowly start back up.
We express our continued great appreciation for the work of our authors and reviewers. Stay healthy and take care of one another.
The rOpenSci Editorial Board ⚠️⚠️⚠️⚠️⚠️
@kamapu we're back! Anything I should know (features planned or other things you want to tackle before review) before I start looking for reviewers?
Welcome back! This is good news and a sign that a piece of life is returning to normality. I have some problems with CRAN because of the encoding of example data (encoding is becoming my worst nightmare). I may solve it by the end of the week and produce a new release.
Summary
The `taxlist` package structures taxonomic information into S4 objects and implements methods for the manipulation of contained information. Such objects may or may not contain information on synonymy, taxonomic ranks, parent-child relations, taxon views (references used to establish relation between taxon usage names and taxon concepts), and taxon (functional) traits.
https://github.com/kamapu/taxlist
Reproducibility, because this package makes taxonomic information available in a quasi-standard format and tests inconsistencies on the content of taxonomic lists.
In general to taxonomists and biodiversity scientists, in particular to vegetation ecologists (`taxlist` objects are implemented in the package vegtable).
While its functionality may overlap the package taxa, the package `taxlist` attempts to be flexible in the degree of completeness of data (incompleteness is very frequent in vegetation-plot databases), it is meant to be integrated in objects containing diversity information (as in the mentioned package `vegtable`) and to import data from local storage (spreadsheets, Turboveg data sets and even PostgreSQL tables by using vegtable2).
Requirements
Confirm each of the following by checking the box. This package:
Publication options
`paper.md` matching JOSS's requirements with a high-level description in the package root or in `inst/`.
Detail
[x] Does `R CMD check` (or `devtools::check()`) succeed? Paste and describe any errors or warnings:
[x] Does the package conform to rOpenSci packaging guidelines? Please describe any exceptions:
If this is a resubmission following rejection, please explain the change in circumstances:
If possible, please provide recommendations of reviewers - those with experience with similar packages and/or likely users of your package - and their GitHub user names:
@arendsee @zachary-foster @sckott