r-hyperspec / r-hyperspec.github.io

Homepage for r-hyperspec ecosystem
https://r-hyperspec.github.io

Scheme of r-hyperspec packages #5

Closed GegznaV closed 3 years ago

GegznaV commented 4 years ago

It is a bit unclear to me where we are going with this project and how it should look at the end of this summer. A vision in the form of a scheme/flowchart would be helpful. The scheme should contain the names of the r-hyperspec family packages and other non-package repositories we are going to create, as well as the dependencies between them. The scheme may change later.

I'm preparing a draft scheme that could be a starting point for the discussion.

GegznaV commented 4 years ago

image

GegznaV commented 4 years ago

@r-hyperspec/r-hyperspec These schemes are for discussion at Wednesday's meeting. They show how I understand our vision of the r-hyperspec family (two versions of the vision are presented). Please study them and prepare your suggestions.


UPDATE: I updated the schemes and moved them to the message below.


The graphs are implemented with GraphViz via DiagrammeR in RStudio. Sources: r-hyperspec-schemes-GraphViz.zip. Unzip, open in RStudio, and press the "Preview" button. Modify and press "Preview" again.

eoduniyi commented 4 years ago

Great job @GegznaV

bryanhanson commented 4 years ago

A couple of comments.

GegznaV commented 4 years ago

I corrected some issues in the schemes.

Scheme 1 ![image](https://user-images.githubusercontent.com/12725868/87557771-000c1100-c6c1-11ea-9bc0-16b9ce58a1ab.png)
Scheme 2 ![image](https://user-images.githubusercontent.com/12725868/87558457-cdaee380-c6c1-11ea-83ff-aedd57f015c6.png)
**Legend**

Font color:
- Black: already implemented packages
- Red: not implemented yet
- Green: non-package repos

Lines/Arrows:
- red: automatic relationship via CI
- blue: package dependencies (e.g., via "imports")
- dashed purple: package dependencies (e.g., via "suggests"; only if installed on the user's computer):
  a. the destination package is used to load other installed packages;
  b. the destination package may also reexport functions from the other packages.

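
For readers without the zip file, here is a minimal sketch of how the legend conventions could be written in GraphViz DOT and rendered through DiagrammeR. It is not the original source of the schemes: the selection of nodes, their implementation status (font colors), and the edge directions are assumptions made purely for illustration, and `render_scheme()` is a hypothetical helper name.

```r
# Illustrative DOT source. Node colors and edges are assumptions,
# not a copy of Scheme 1 or Scheme 2.
dot_source <- '
digraph r_hyperspec {
  node [shape = box]

  // Font color: black = implemented, red = not yet, green = non-package repo
  "hyperSpec"               [fontcolor = black]
  "hySpc.ggplot2"           [fontcolor = red]
  "hySpc.read.txt"          [fontcolor = red]
  "r-hyperspec.github.io"   [fontcolor = green]

  // blue: hard dependency (e.g., via "imports")
  "hySpc.ggplot2"  -> "hyperSpec" [color = blue]
  "hySpc.read.txt" -> "hyperSpec" [color = blue]

  // dashed purple: soft dependency (e.g., via "suggests")
  "hyperSpec" -> "hySpc.ggplot2" [color = purple, style = dashed]

  // red: automatic relationship via CI
  "hyperSpec" -> "r-hyperspec.github.io" [color = red]
}
'

# Hypothetical helper: returns the rendered widget, or NULL (invisibly)
# when DiagrammeR is not installed.
render_scheme <- function(dot) {
  if (!requireNamespace("DiagrammeR", quietly = TRUE)) {
    message("DiagrammeR is not installed; returning NULL.")
    return(invisible(NULL))
  }
  DiagrammeR::grViz(dot)
}
```

Calling `render_scheme(dot_source)` in an interactive session should display the graph in the viewer pane, much like the "Preview" button does for a `.gv` file.
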
cbeleites commented 4 years ago

Here's my proposal, edited from @GegznaV's list on slack:

Main package

Bridge packages

... connect hyperSpec with other packages where interaction does not work automatically. They can go on CRAN since we don't need huge test data sets.

Data packages:

Helper/Utility packages:

Helper GH repos:


Input/Output packages:

This is where things are more complicated...

If we want to cut down dependencies (https://github.com/cbeleites/hyperSpec/issues/215), at least some file import packages should go by file format rather than manufacturer:

There are import filters that do not add dependencies for binary formats:

These two file formats are sufficiently widespread and well-known that I believe they should each go into its own package.

There are import filters for a large variety of ASCII/text based formats:

Should these be bundled into, say, hySpc.read.txt?

Last but not least, there are a number of file formats where we have example data but no import functions yet. At least some of them will have their own dependencies.


@bryanhanson, @GegznaV, @eoduniyi, @ximeg: What do you think?

It may be better to name the file import packages consistently, all by file format. This would mean that we drop hySpc.read.Witec (or rather, rename it to hySpc.read.txt). For example, several manufacturers export in the Thermo Galactic .spc format, and their files are slightly different, so we have not only read.spc() but also read.spc.KaiserMap() etc. Putting the latter into a package hySpc.read.Kaiser would make that package depend on hySpc.read.spc, which I'd like to avoid.
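
The dependency argument can be made concrete with a small sketch. The two layouts below are illustrative assumptions (including the assumption that every reader package imports hyperSpec), and `cross_reader_deps()` is a hypothetical helper that lists reader packages depending on other reader packages.

```r
# Hypothetical helper: for each package whose name starts with `prefix`,
# return those of its imports that also start with `prefix`.
cross_reader_deps <- function(imports, prefix = "hySpc.read.") {
  readers <- names(imports)[startsWith(names(imports), prefix)]
  hits <- lapply(imports[readers], function(deps) deps[startsWith(deps, prefix)])
  hits[lengths(hits) > 0]
}

# Layout A (by manufacturer): read.spc.KaiserMap() lives in hySpc.read.Kaiser,
# which therefore has to import hySpc.read.spc.
by_manufacturer <- list(
  "hySpc.read.spc"    = c("hyperSpec"),
  "hySpc.read.Kaiser" = c("hyperSpec", "hySpc.read.spc")
)

# Layout B (by file format): read.spc() and read.spc.KaiserMap() share
# one package, so no reader depends on another reader.
by_format <- list(
  "hySpc.read.spc" = c("hyperSpec")
)

cross_reader_deps(by_manufacturer)
cross_reader_deps(by_format)
```

Under these assumptions, the first call reports hySpc.read.Kaiser as depending on hySpc.read.spc, while the second returns an empty list.
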

ximeg commented 3 years ago

As long as the end user can easily install all r-hyperSpec packages and easily (automatically) load all of them, we can split the file format functions among packages however we want. It is important to remove this burden from the end user. I like and support the idea of packaging based on dependencies and trying to minimize them.

My point is that as an end user (data analyst/spectroscopist) I want to be able to

```r
library(hySpc.ggplot2)
library(hySpc.chondro)
library(hySpc.matrixStats)
library(hySpc.baseline)
library(hySpc.read.ENVI)
library(hySpc.read.spc)
library(hySpc.read.txt)

# Now I can finally write a line of code that reads a file, subtracts a baseline, and makes a plot
# ...
# ???
# Wait, I forgot to load a package that provides the `filter()` function...
# What was its name? ... Google it... Ah, `dplyr`
library(hySpc.dplyr)
# ...
# works!
```

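
One way to take this burden off the user, in line with the "used to load other installed packages" arrows in the scheme legend, is a single entry point that attaches whatever hySpc.* packages are installed. The sketch below is only an illustration of that idea: `find_hyspc_packages()` and `attach_hyspc()` are hypothetical names, not functions from any r-hyperspec package.

```r
# Hypothetical helper: pick the package names that start with "hySpc."
# out of a character vector of installed package names.
find_hyspc_packages <- function(installed = rownames(utils::installed.packages()),
                                prefix = "hySpc.") {
  sort(installed[startsWith(installed, prefix)])
}

# Hypothetical helper: attach every hySpc.* package found on this machine.
attach_hyspc <- function(pkgs = find_hyspc_packages()) {
  for (pkg in pkgs) {
    suppressPackageStartupMessages(
      library(pkg, character.only = TRUE)
    )
  }
  invisible(pkgs)
}

# Example with a made-up list of installed packages (nothing is attached here):
find_hyspc_packages(c("dplyr", "hySpc.read.spc", "ggplot2", "hySpc.ggplot2"))
```

With something like this in place, a user session would start with one call to `attach_hyspc()` instead of a long list of `library()` calls.
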
eoduniyi commented 3 years ago

vision-model

@GegznaV this is still useful:

Screen Shot 2020-07-18 at 3 47 26 AM

via RGSOC_2020_Proposal

io

@cbeleites It sounds like hySpc.read.Witec will be turned into hySpc.read.txt, which means this will be a larger package that bundles the import filters for Witec, Renishaw, Andor, PerkinElmer, and Horiba. The remaining file io packages will support reading spectral data from MATLAB, Winspec, Shimadzu?, and JCAMP-DX. Additionally, there will be dedicated packages for ENVI and spc.

ux/ui

@ximeg I totally agree with you on this; I wonder about other ways to support the friendliness/experience for typical spectroscopic work.

maintainability

@bryanhanson I think the documentation on functionality and contributing/style has made the project easier to maintain.

eoduniyi commented 3 years ago

From the perspective of time I think we've gotten more specific about the implementation details: @cbeleites 2011 hyperSpec figure -> RGSOC_2020 figures -> @GegznaV figures

bryanhanson commented 3 years ago

Closing, as we have pretty much settled on a naming scheme and the issue is old.