Triamus / play

play repo for experiments (mainly with git)

unsorted #21

Open Triamus opened 6 years ago

Triamus commented 6 years ago

Others

the-current-state-of-applied-data-science

get this book: Data Computing: An Introduction to Wrangling and Visualization in R https://leanpub.com/data-computing

get this book: Modern Data Science with R (Baumer, Kaplan, Horton)

get this book: A Data Engineer's Manual (Joseph Clark)

A Layered Grammar of Graphics http://vita.had.co.nz/papers/layered-grammar.pdf

ggmap: Spatial Visualization with ggplot2 https://journal.r-project.org/archive/2013-1/kahle-wickham.pdf

https://bookdown.org/yihui/bookdown/bookdown.pdf

http://heather.cs.ucdavis.edu/~matloff/Python/PLN/FastLanePython.pdf Fast Lane to Python (book, Matloff)

writing R functions

        http://www.brodrigues.co/fput/

        http://www.stat.cmu.edu/~cshalizi/402/programming/writing-functions.pdf

        https://www.r-bloggers.com/how-to-write-and-debug-an-r-function/

        https://www.bioconductor.org/help/course-materials/2013/CSAMA2013/friday/afternoon/R-programming.pdf

lending club

        http://www.lendingmemo.com/lending-club-prosper-default-rates/

        https://res.cloudinary.com/general-assembly-profiles/image/upload/v1416535475/uwumoooppttsmpgu1goo.pdf

        http://blog.nycdatascience.com/r/p2p-loan-data-analysis-using-lending-club-data/

        https://blog.nycdatascience.com/student-works/data-visualization-lending-club-issued-loans/

        https://rstudio-pubs-static.s3.amazonaws.com/115829_32417d32dbce41eab3eeaf608a0eef9d.html

        https://rstudio-pubs-static.s3.amazonaws.com/203258_d20c1a34bc094151a0a1e4f4180c5f6f.html

        ggplot templates          http://zevross.com/blog/2014/08/04/beautiful-plotting-in-r-a-ggplot2-cheatsheet-3/

        http://trendct.github.io/data/2016/05/lending-club/

        http://docs.ggplot2.org/current/geom_map.html

        http://eriqande.github.io/rep-res-web/lectures/making-maps-with-R.html

        http://www.datasciencecentral.com/profiles/blogs/analysis-of-lending-club-s-data

        https://shuhelicopter.github.io/Data_Exploratory_Analysis_of_LC.html

        http://cs229.stanford.edu/proj2015/199_report.pdf

r lists

https://jennybc.github.io/purrr-tutorial/bk00_vectors-and-lists.html

http://r4ds.had.co.nz/lists.html

http://stackoverflow.com/questions/7481522/how-do-i-loop-in-a-list-and-access-both-names-and-attributes

http://stackoverflow.com/questions/31561238/lapply-function-loops-on-list-of-lists-r

Linux commands

du -hs * 2>/dev/null | sort -hr

du -hs * 2>/dev/null | sort -hr | tee -a ./file.txt

printf '\n%s\n%s\n' "$(date)" "$(du -hs * 2>/dev/null | sort -hr)" | tee -a ./file.txt

ls --sort=size -l /export/gcs1/data/prod/gwst/_share

find . -name "*.sas" -exec grep -il flgdata {} \;

find . -name ".gitconfig"
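
The disk-usage and find/grep one-liners above could be wrapped into small reusable functions. This is a minimal Bash sketch, assuming GNU coreutils (for `sort -h`); the function names `log_disk_usage` and `find_matching` and their arguments are made up for illustration and do not come from the notes.

```shell
#!/usr/bin/env bash

# Hypothetical helper: append a timestamped, size-sorted disk-usage
# report for a directory to a log file (and echo it to the terminal).
log_disk_usage() {
    local dir="${1:-.}"           # directory to inspect (default: current)
    local log="${2:-./file.txt}"  # log file to append to
    # Passing the output through %s placeholders keeps any '%' or '\'
    # in file names from being interpreted as printf format directives.
    printf '\n%s\n%s\n' "$(date)" \
        "$(cd "$dir" && du -hs -- * 2>/dev/null | sort -hr)" | tee -a "$log"
}

# Hypothetical helper: list files whose name matches a glob and whose
# content matches a pattern, case-insensitively.
find_matching() {
    local pattern="$1"    # text to search for, e.g. flgdata
    local glob="$2"       # file name glob, e.g. '*.sas'
    local dir="${3:-.}"   # where to start searching
    # '{} +' batches many files into one grep call instead of one per file.
    find "$dir" -type f -name "$glob" -exec grep -il -- "$pattern" {} +
}
```

For example, `log_disk_usage . ./file.txt` reproduces the logging one-liner, and `find_matching flgdata '*.sas'` reproduces the SAS search.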

mandat

https://cran.r-project.org/web/packages/dat/index.html dat: Tools for Data Manipulation

https://cran.r-project.org/web/packages/rpivotTable/index.html rpivotTable: Build Powerful Pivot Tables and Dynamically Slice & Dice your Data https://github.com/smartinsightsfromdata/rpivotTable

https://github.com/MichaelChirico/funchir

https://www.kaggle.com/general/25058

https://github.com/ben519

http://stackoverflow.com/questions/32089594/optimal-way-in-data-table-to-make-multiple-columns-from-vectors-of-column-name-s

https://libraries.io/github/jangorecki/big.data.table

https://gitlab.com/jangorecki/big.data.table

http://stackoverflow.com/questions/41767694/get-column-names-when-summarising-a-data-table-using-multiple-functions-in-r

http://juliasilge.com/blog/Beginners-Guide-to-Travis/

http://stackoverflow.com/questions/22568956/how-can-i-compute-statistics-by-decile-groups-in-data-table/39560604#39560604

https://github.com/ben519/mltools/blob/master/R/bin_data.R

https://gormanalysis.com/

http://stackoverflow.com/questions/10527072/using-data-table-package-inside-my-own-package/10529888#10529888

https://github.com/stephlocke/Rtraining/blob/master/inst/handouts/fundamentals/tablewrangling.Rmd

http://stackoverflow.com/questions/36287450/using-pre-defined-default-variable-names-within-dplyr-utility-functions

http://kbroman.org/pkg_primer/pages/tests.html

https://raw.githubusercontent.com/juliasilge/tidytext/master/README.md

data profiling

        http://de.slideshare.net/michellekolbe/data-profiling-with-r

        https://cran.r-project.org/web/packages/psych/index.html

        https://www.quora.com/What-are-some-of-the-best-R-packages-for-data-profiling-or-other-stand-alone-open-source-data-profiling-software

        https://hpi.de/fileadmin/user_upload/fachgebiete/naumann/publications/2016/ICDE_2016_DataProfilingTutorial_CompleteSlides.pdf

        https://www.linkedin.com/pulse/r-data-profiling-cleansing-kannan-kalidasan

        https://www.analyticsvidhya.com/blog/tag/data-profiling/

        https://rpubs.com/bbolker

        http://www.columbia.edu/~sjm2186/EPIC_R/EPIC_R_BigData.pdf

        http://files.meetup.com/1225993/Porzak%20-%20Data%20Profiling%20with%20R%20%26%20MySQL%20-%20July%202011.pdf

        http://datascienceheroes.com/

packages

        https://cran.r-project.org/web/packages/funModeling/           data profiling, in particular the functions df_status and equal_freq

        https://cran.rstudio.com/web/packages/rWind/

        https://cran.rstudio.com/web/packages/pipeliner/

        https://cran.rstudio.com/web/packages/tidyquant/

        !!! Hmisc, in particular for data profiling and its binning function (cut2)

        !!! DescTools: Tools for Descriptive Statistics           https://cran.r-project.org/web/packages/DescTools/index.html

        broom

        psych

        tidyquant         https://www.r-bloggers.com/tidyquant-0-3-0-ggplot2-enhancements-real-time-data-and-more/

        multidplyr      https://www.r-bloggers.com/speed-up-your-code-part-2-parallel-processing-financial-data-with-multidplyr-tidyquant/

        vtreat   https://cran.r-project.org/web/packages/vtreat/index.html

diff between tables (like SAS PROC COMPARE)

        http://stackoverflow.com/questions/28056805/finding-discrepancies-between-two-tables-in-r

tools/techniques

        data.table in shiny - fast lookups        https://www.r-bloggers.com/fast-data-lookups-in-r-dplyr-vs-data-table/

        examples on exploratory analysis with ggplot + sparklyr       https://www.r-bloggers.com/predicting-food-preferences-with-sparklyr-machine-learning/

        summary stats for list of variables      http://stackoverflow.com/questions/31258547/data-table-row-wise-sum-mean-min-max-like-dplyr

        data quality framework with R           http://ians-oracle.blogspot.de/2016/05/data-quality-using-r.html

misc

https://www.datacamp.com/community/tutorials/functions-in-r-a-tutorial

http://www.r-bloggers.com/how-to-write-good-tests-in-r/

http://www.r-bloggers.com/the-user-2016-tutorials/

http://www.r-bloggers.com/boost-your-data-munging-with-r/

http://www.r-bloggers.com/express-intro-to-dplyr/

http://www.r-bloggers.com/dplyr-0-5-0/

http://www.r-bloggers.com/intro-to-the-data-table-package/

http://www.r-bloggers.com/tidyr-0-5-0/

https://rollingyours.wordpress.com/2016/06/14/fast-aggregation-of-large-data-with-the-data-table-package/

http://www.r-bloggers.com/dplyr-do-some-tips-for-using-and-programming/

http://www.r-bloggers.com/giving-back-with-code/

http://www.codenewbie.org/

http://learnprogramming.github.io/

http://stat545.com/

http://www.r-bloggers.com/data-frame-columns-as-arguments-to-dplyr-functions/

http://stackoverflow.com/questions/28973056/in-r-pass-column-name-as-argument-and-use-it-in-function-with-dplyrmutate-a

http://stackoverflow.com/questions/26492280/non-standard-evaluation-nse-in-dplyrs-filter-pulling-data-from-mysql

http://www.r-bloggers.com/the-mathematics-of-machine-learning/

https://www.datacamp.com/courses/writing-functions-in-r/?tap_a=5644-dce66f&tap_s=10907-287229

http://blog.osteele.com/posts/2008/05/my-git-workflow/

http://2ndscale.com/rtomayko/2008/the-thing-about-git

https://www.atlassian.com/git/tutorials/comparing-workflows/centralized-workflow

http://stackoverflow.com/questions/21435339/data-table-vs-dplyr-can-one-do-something-well-the-other-cant-or-does-poorly/27840349

read again on Non-standard evaluation http://adv-r.had.co.nz/Computing-on-the-language.html

debugging in RStudio https://support.rstudio.com/hc/en-us/articles/205612627-Debugging-with-RStudio http://adv-r.had.co.nz/Exceptions-Debugging.html

A new data processing workflow for R: dplyr, magrittr, tidyr, ggplot2 http://zevross.com/blog/2015/01/13/a-new-data-processing-workflow-for-r-dplyr-magrittr-tidyr-ggplot2/

https://ttvand.github.io/Winning-approach-of-the-Facebook-V-Kaggle-competition/

https://blog.dominodatalab.com/using-r-and-python-for-common-sas-functions/

https://cran.r-project.org/web/packages/DataCombine/

scriptSearch function https://cran.r-project.org/web/packages/BurStMisc/BurStMisc.pdf

https://www.r-bloggers.com/plotting-background-data-for-groups-with-ggplot2/

http://livebook.datascienceheroes.com/

http://stackoverflow.com/questions/27677283/evaluating-both-column-name-and-the-target-value-within-j-expression-within-d

http://yihui.name/printr/