I think we can thread a good story through all those concepts: acquiring data (import file and/or use an API) -> making it tidy -> do some stats -> visualise it -> create an rmarkdown report -> publish with shiny.
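That story could be sketched end to end in a few lines of tidyverse code (file name and column names here are hypothetical placeholders, just to show the shape of the pipeline):

```r
library(tidyverse)

# Acquire: read a (hypothetical) CSV export
raw <- read_csv("checkouts.csv")

# Tidy: pivot hypothetical per-year columns into long form,
# so there is one row per observation
tidy <- raw %>%
  pivot_longer(cols = starts_with("y"),
               names_to = "year", values_to = "checkouts")

# Stats: summarise by group
stats <- tidy %>%
  group_by(year) %>%
  summarise(total = sum(checkouts, na.rm = TRUE))

# Visualise
ggplot(stats, aes(x = year, y = total)) +
  geom_col()
```

The reporting and publishing steps would then wrap code like this in an R Markdown document, and optionally a Shiny app.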
This makes very good sense to me. I think your summary flowchart is an excellent model.
Some quick thoughts....
One comment on Shiny. While I am a big fan, I also think your concern about complexity (as you call it, cognitive load) is worth thinking/talking through. One potential alternative (just thinking out loud) is flexdashboard/HTML widgets. The primary reason is there's less scaffolding (than Shiny) and no need for a Shiny server. There are drawbacks to this as well, but it's quick to market and simple. My example: I teach about dynamic dashboards with flexdashboard and HTML widgets, including crosstalk (for linked brushing), leaflet (maps), DT (for the linked data tables), and Plotly (because the ggplotly function turns any ggplot object into an interactive vis). Anyway, my thinking is that I can often live without Shiny, but the goal of presenting interactive charts and maps via dashboards is still important, and I think easier via HTML widgets. There is a question of scale and optimization that can be brought into this discussion (or ignored ;-). Hope that makes sense.
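For a sense of how small the htmlwidgets route can be, here is a minimal sketch using the built-in mtcars data (no Shiny server involved; embedded in a flexdashboard chunk, the widget renders entirely client-side):

```r
library(ggplot2)
library(plotly)

# An ordinary static ggplot...
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point()

# ...becomes an interactive htmlwidget with one call
ggplotly(p)
```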
On Tue, Jun 18, 2019 at 3:09 AM Stéphane Guillou notifications@github.com wrote:
- Creating a webapp with Shiny to publish an interactive (e.g. plotly) visualisation of stats (too much cognitive load maybe?)
- Deal with text: use regular expressions (with stringr for example).
- Play with APIs, which is a common use-case of R for accessing reference databases. For example with Web of Science: https://github.com/juba/rwos
I think we can thread a good story through all those concepts: acquiring data (import file and/or use an API) -> making it tidy -> do some stats -> visualise it -> create an rmarkdown report -> publish with shiny.
-- John Little (John.Little@Duke.edu), Data Science Librarian, Data Visualization Services, Duke Libraries -- https://library.duke.edu/data/ | https://johnlittle.info/
@libjohn @stragu all this sounds good and ambitious :). Should we try to set up a call to chunk it up and come up with a plan? @libjohn my experience with the CrossRef data is that it doesn't lend itself well to basic viz in the form we have it in the OpenRefine lesson, but I would love to see what you did with it in R -- can you share? I've used LA public library circ data for a ggplot2 & shiny lesson: https://ucla-data-archive.github.io/elag2018-shiny/. I've wanted to refactor it into Carpentries style, but haven't had time. I'm open to using whatever dataset as long as we can accomplish our goals and teach basic data & stats literacy.
There might be some additional ideas here: https://libraryassessment.org/program/
@jt14den : I'm happy to share the repo I developed to document my exploration...
Let me preface by noting my repo is a set of notes to myself, based on my initial exploration and a conversation with Chris. (So all the bad grammar, poor spelling, and off-the-cuff thinking is there to jog my memory only.) When @stragu's post came through my email, I was riffing off of that repo/personal thinking from a few weeks ago, although I'm sure I put in too many ideas for one workshop. All this is a long-winded way of saying that if you have a process in mind, yes, I'm happy to join a call that moves towards next steps and developing a workshop. Let's see what fits.
In direct answer to your question about how I visualized the CrossRef data: the PDF document in that repo will probably give the quickest glimpse. Scroll way down to see the charts and graphs.
I should say, I'm rather agnostic on the value of the CrossRef data for learning R/tidyverse. My purpose in using that data was only to familiarize myself with a successful Library Carpentry module; OpenRefine seemed like a good handhold for that. Alternatively, see my README (again, notes to myself) as perhaps a better example of how I was trying to think through the issue of an R workshop for librarians, and how to organize my thinking. Maybe there is something transferable in all of this.
But, bottom line, like you I don't feel all that strongly about the CrossRef data. In fact, your idea of using circ data potentially generates broader interest and is maybe more in line with the link @libcce sent about the upcoming assessment conference. Looking through that conference program, I see those folks intend to analyze circ data as well -- only using Tableau.
Hey @stragu & @jt14den .
I'm writing to find out if you remain interested in jump-starting this effort on intro R. @jt14den previously suggested a call, and maybe there is still some enthusiasm for this?
If you are interested, I can launch a Zoom next Wednesday (US, July 17) / Thursday (AU, July 18): 6pm my time, 8am Brisbane, and 3pm LA.
Please give me a quick reply if you're interested or want to offer an alternative time/approach. Totally open to other options. Very interested to chat more with either of you. @libcce also.
Since my last post I've been slowly plugging away -- exploring datacarpentry/r-socialsci, took a closer look at Tim's work visualizing LA Public Library circ data (very nice), looked at some San Francisco Public Library circ data available on data.gov, and am also recently inspired by Mine Cetinkaya-Rundel's course design pedagogy, Data Science in a Box. All my notes and scribbles are in my repo.
Hi!
I don't know if there has been much movement on the introduction to R lesson for Library Carpentry in the past months, but I think one package that should be addressed here is bibliometrix. The activities developed with this package can go together with other points mentioned before in this thread. A possible path:
1. Acquire a dataset: you have already mentioned the Crossref API; perhaps also the DataCite API and the Dimensions API. These are all APIs of interest to the library world. Otherwise, Web of Science can be an alternative source (not a big fan of it though, since it's private and has a coverage bias).
2. Make sure that the dataset is in .bib format so bibliometrix can read it correctly.
3. Prepare bibliometrix to read the dataset(s).
4. Make some visualisations / use bibliometrix for co-citation analysis.
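A sketch of those bibliometrix steps might look like this (the file name is a hypothetical placeholder, and the argument values are worth double-checking against the current bibliometrix documentation):

```r
library(bibliometrix)

# Read a Web of Science export saved as BibTeX into a data frame
M <- convert2df("savedrecs.bib", dbsource = "wos", format = "bibtex")

# Descriptive bibliometric analysis
results <- biblioAnalysis(M)
summary(results, k = 10)

# Co-citation analysis on cited references, plotted as a network
net <- biblioNetwork(M, analysis = "co-citation", network = "references")
networkPlot(net, n = 30, Title = "Co-citation network")
```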
I hope this makes sense :-)
Best,
Paloma
I’ve been hoping for a lesson around bibliometrics @pmarrai 😀 We definitely need a lead on lesson dev. I think it’s been difficult for anyone to find the time to move it forward.
Great ideas here @pmarrai. I have some content I've been getting cleaned up based on this workshop: https://ciakovx.github.io/fsci_syllabus.html. I'll try and get it posted here in the next couple of weeks and we can dig into it.
I haven't used bibliometrix. Another useful package is citecorp: https://ropensci.org/technotes/2019/09/17/citecorp/
I also have been using rromeo lately with researcher CVs to look up what they can deposit in the IR: https://ropensci.github.io/rromeo/articles/rromeo.html
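That CV workflow can be as simple as the following sketch (journal title and ISSN are hypothetical examples; the function names are as I recall them from the rromeo vignette linked above, so verify against the current docs):

```r
library(rromeo)

# Look up a journal's self-archiving policy by (partial) title...
rr_journal_name("Journal of Librarianship", qtype = "contains")

# ...or, more precisely, by ISSN
rr_journal_issn("0961-0006")
```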
Lots of potential here!
Clarke
Great! Then I'd be glad to participate if you all move it forward :-)
Hi all, I just ran across the official "Master the Tidyverse" repo (includes slides and Rmd files) that RStudio makes available as part of their certified training -- elsewhere in their certification notes they reference contributions by the Carpentries. Maybe this repo will inspire you as it is inspiring me. Always great to see the work of an accomplished instructor like Garrett Grolemund. Check it out: https://github.com/rstudio/master-the-tidyverse
Good work and good luck
John
Closing this issue.
An opening brainstorming issue on what this lesson might contain. Please pitch in!
Here are some things I've thought about it.
- tidyverse (dplyr & tidyr)
- ggplot2
- rmarkdown