Triamus / play

play repo for experiments (mainly with git)
general notes #20

Open Triamus opened 6 years ago

Links to explore

https://github.com/Azure/mmlspark/blob/master/README.md

https://rviews.rstudio.com/2017/09/20/dashboards-with-r-and-databases/

https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-r-server-get-started

https://www.amazon.com/Python-Machine-Learning-scikit-learn-TensorFlow/dp

HDInsight docs https://docs.microsoft.com/en-us/azure/hdinsight/

http://www.storybench.org/getting-started-data-visualization-r-using-ggplot2/

http://seankross.com/2017/09/17/Enough-Docker-to-be-Dangerous.html

https://github.com/rhiever/Data-Analysis-and-Machine-Learning-Projects/blob/master/example-data-science-notebook/Example%20Machine%20Learning%20Notebook.ipynb

https://azure.microsoft.com/en-us/blog/predictive-maintenance-using-pyspark/

https://azure.microsoft.com/en-us/blog/diving-deep-into-what-s-new-with-azure-machine-learning/

https://eddjberry.netlify.com/post/writing-your-thesis-with-bookdown/

http://blog.revolutionanalytics.com/2017/09/news-from-ignite.html

http://datascienceathome.podbean.com/e/parallelizing-and-distributing-stochastic-gradient-descent/

http://www.randalolson.com/2014/06/28/how-to-make-beautiful-data-visualizations-in-python-with-matplotlib/

https://azure.microsoft.com/en-us/training/learning-paths/azure-ai-developer/

https://www.forbes.com/sites/quora/2017/09/06/ten-things-everyone-should-know-about-machine-learning/

http://tomaugspurger.github.io/scalable-ml-02.html

rstats reprex package for reproducible examples.

https://www.slideshare.net/mobile/arunkejariwal/modern-realtime-streaming-architectures?twitter=@bigdata

Cross-check with my spark post https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-tips-and-tricks-running-spark-windows.html

http://blog.revolutionanalytics.com/2017/09/microsoft-ml-server-92.html

https://azure.microsoft.com/en-us/training/learning-paths/azure-solution-architect/

Latex photo https://github.com/jonocarroll/mathpix

http://htmlpreview.github.io/?https://github.com/brodieG/vetr/blob/master/extra/compare.html

Azure documentation https://docs.microsoft.com/en-us/azure/

http://training.play-with-docker.com/

SAP HANA, express edition, also on Docker https://www.sapstore.com/solutions/99055/SAP-HANA%2C-express-edition

SAP Data Hub?

https://azure.microsoft.com/en-us/blog/reference-architecture-for-sap-netweaver-and-sap-hana-on-azure/

https://blogs.msdn.microsoft.com/azurecat/2017/08/25/sap-hana-on-azure-large-instances-setup-new-whitepaper/

https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/sap/

http://www.datasciencecentral.com/profiles/blogs/27-free-data-mining-books

Shiny tutorial

https://www.dropbox.com/s/rjt6g3ctdqvihat/shiny-quickstart-1.zip?dl=0

https://www.dropbox.com/s/ebvcqlwhmbr2625/How-to-start-2.zip?dl=0

https://www.dropbox.com/s/3043uj9vckykr5l/How-to-start-3.zip?dl=0

https://github.com/jennybc/docker-why/blob/master/README.md

Chest X-ray images: Wang et al., ChestX-ray8 from NIH https://t.co/4ZPb8mQIh2?amp=1

https://youtu.be/tW1JV6bHXFA

Visual Studio for Raspberry Pi: remote debugging.

https://stackoverflow.blog/2017/09/29/making-remote-work-behind-scenes/

https://medium.com/applied-data-science/new-r-package-the-xgboost-explainer-51dd7d1aa211

https://juliasilge.com/blog/tidytext-0-1-4/

http://www.datasciguide.com/recommended-resources-for-beginners/

http://shiny.rstudio-staging.com/articles/understanding-reactivity.html

Microsoft container https://m.youtube.com/watch?v=BPMOqONNMNU

http://jov.arvojournals.org/article.aspx?articleid=2504104&utm_content=buffer13643&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer

https://t.co/SjqC1AEMkA?amp=1

Blogdown tutorial http://notes.peter-baumgartner.net/tutorial/

http://go.anodot.com/building-large-scale-wp-part-1

http://www.scipy-lectures.org/

http://blog.kaggle.com/2017/10/05/data-science-101-sentiment-analysis-in-r-tutorial/

https://learnpythonthehardway.org/book/appendixa.html

https://gigadom.wordpress.com/2017/10/06/practical-machine-learning-with-r-and-python-part-1/

Spark packages

Energy data analytics lab @DukeUEnergy

https://rachaellappan.github.io/bookdown/

https://jdblischak.github.io/workflowr/index.html

https://www.ramp.studio/

http://colinfay.me/writing-r-extensions/

Computer age statistical... Hastie https://t.co/PzHZpVpi7e?amp=1

An Introduction to Rocker: Docker Containers for R https://arxiv.org/abs/1710.03675

https://docs.microsoft.com/en-us/azure/sql-database/sql-database-performance-guidance?wt.mc_id=AID627566_QSG_SCL_190008

Tutorial: Azure Data Lake analytics with R http://blog.revolutionanalytics.com/2017/10/adla-with-r.html

Create your first function using Visual Studio https://docs.microsoft.com/en-us/azure/azure-functions/functions-create-your-first-function-visual-studio

Discover Cosmos DB, the globally distributed database transforming modern data management. https://info.microsoft.com/azure-build-modern-apps-at-global-scale-register.html

Docker for R Package Development http://www.jimhester.com/2017/10/13/docker/

Tidyeval presentation https://t.co/aGbkrqwYAi?amp=1

RSelenium

http://www.petrkeil.com/?p=2886

https://youtu.be/hDXY6Tco2JU

http://zevross.com/blog/2015/05/19/scrape-website-data-with-the-new-r-package-rvest

Maëlle Salmon's blog: did she write something on this?

http://brooksandrew.github.io/simpleblog/articles/scraping-with-selenium/

https://rud.is/b/2017/02/09/diving-into-dynamic-website-content-with-splashr/amp/

https://www.r-bloggers.com/how-successful-can-an-r-meetup-be-meetr-in-tricity-rselenium-and-big-data-processing-2/amp/

http://randomekek.github.io/deep/deeplearning.html

https://sux13.github.io/DataScienceSpCourseNotes/

https://docs.microsoft.com/en-us/vsts/

https://www.opencpu.org/posts/opencpu-with-docker/

https://www.r-bloggers.com/a-newbies-install-of-keras-tensorflow-on-windows-10-with-r/amp/

https://docs.microsoft.com/en-us/

https://stackoverflow.blog/2017/10/17/power-calculations-p-values-ab-testing-stack-overflow/

https://nats-www.informatik.uni-hamburg.de/SWC/

https://www.packtpub.com/big-data-and-business-intelligence/pandas-cookbook

https://machinelearningmastery.com/prepare-text-data-machine-learning-scikit-learn/

https://www.r-bloggers.com/is-it-faster-to-take-a-bike-or-taxi-in-nyc/amp/

Opensensors.io

https://htmlpreview.github.io/?https://github.com/brodieG/oshka/blob/master/inst/doc/nse-fun.html#an-ersatz-data.table

https://bookdown.org/Tazinho/Tidyverse-Cookbook/

BinderHub and JupyterHub

http://www.r2d3.us/visual-intro-to-machine-learning-part-1/

Azure serverless apps https://m.youtube.com/watch?v=aDBLh1Wf0L0

https://gigadom.wordpress.com/2017/10/20/practical-machine-learning-with-r-and-python-part-3/

7 properties of device security, Azure IoT. Seen on YouTube (MS Ignite).

Microsoft UWP shop analytics example on Raspberry Pi

Data Munging with R, Carroll book https://t.co/H3lB3riowu?amp=1

Think Like a Data Scientist https://medium.com/towards-data-science/how-to-choose-statistical-software-tools-4870dd3c92a0

http://lockedata.uk/power-bi-painpoints/

https://www.edgarsdatalab.com/2017/10/22/intro-to-tensorflow-in-r/

https://rud.is/b/2017/10/22/a-call-to-tweets-blog-posts/amp/

http://blog.revolutionanalytics.com/2017/10/statistical-machine-learning-with-microsoft-ml.html

https://fronkonstin.com/2017/10/24/a-shiny-app-to-create-sentimental-tweets-based-on-project-gutenberg-books/

Visual Studio Dev Essentials free program

https://github.com/lockedata/pres-powerbi

https://blogs.msdn.microsoft.com/visualstudio/2017/10/26/run-book-run-from-physical-paper-to-executable-online-books/

https://medium.com/@keeper6928/how-to-unit-test-machine-learning-code

https://www.datavizualization.datasciencecentral.com/blog/outlier-detection-with-parametric-and-non-parametric-methods

https://www.r-bloggers.com/not-mustard-exploring-mcdonalds-reviews-on-yelp-with-r/amp/

http://robinlovelace.net/geocompr/

https://www.datasciencecentral.com/profiles/blogs/comprehensive-repository-of-data-science-and-ml-resources

https://www.datasciencecentral.com/profiles/blogs/10-free-machine-learning-books

http://www.listendata.com/p/r-programming-tutorials.html?m=1

http://www.business-science.io/code-tools/2017/10/28/demo_week_h2o.html

Carnegie Mellon: Mario Berges, smarter infrastructure analytics lab. Also see Enes Hosgor.

https://www.datasciencecentral.com/profiles/blogs/the-mathematics-of-machine-learning

https://www.dataquest.io/home become a data scientist

https://medium.freecodecamp.org/every-single-machine-learning-course-on-the-internet-ranked-by-your-reviews

https://tutorials.ubuntu.com/tutorial/tutorial-windows-ubuntu-hyperv-containers?backURL=/#1

https://databricks.com/blog/2017/10/30/introducing-vectorized-udfs-for-pyspark.html

https://www.analyticsvidhya.com/blog/2016/01/guide-data-exploration/

http://scikit-learn.org/stable/supervised_learning.html#supervised-learning

https://www.inwt-statistics.com/read-blog/promises-and-closures-in-r.html

http://nbviewer.jupyter.org/github/bckenstler/dsb17-walkthrough/blob/master/Part%201.%20DSB17%20Preprocessing.ipynb

https://raybuhr.github.io/2017/10/making-predictions-over-http/

https://github.com/databricks/spark-deep-learning

https://stackoverflow.com/questions/45101045/why-use-purrrmap-instead-of-lapply/47123420#47123420

https://medium.freecodecamp.org/want-to-know-how-deep-learning-works-heres-a-quick-guide-for-everyone-1aedeca88076

Tink.de

https://blogs.technet.microsoft.com/machinelearning/2017/05/02/end-to-end-scenarios-enabled-by-the-data-science-virtual-machine-video/

https://stackoverflow.com/questions/47231875/non-standard-subsetting-of-data-frames

http://veekaybee.github.io/2017/09/26/python-packaging/

http://www.dataperspective.info/2017/11/information-retrieval-document-search-using-vector-space-model-in-r.html?m=1

http://www.unofficialgoogledatascience.com/2016/10/practical-advice-for-analysis-of-large.html?m=1

http://blog.revolutionanalytics.com/2017/11/azure-learning-plans.html

https://www.datasciencecentral.com/profiles/blogs/100-commonly-asked-data-science-interview-questions

http://ropenscilabs.github.io/r-docker-tutorial/

https://rviews.rstudio.com/2017/11/15/shiny-and-scheduled-data-r/

https://stanfordmlgroup.github.io/projects/chexnet/

Human Microbiota and Ophthalmic Disease https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5045141/

http://www.katharineegan.com/blog/international-coastal-cleanup-data-exploration-by-home-states-and-territories-2008-2015

https://www.amazon.com/Mastering-Feature-Engineering-Principles-Techniques/dp/1491953241

https://dcl-2017-04.github.io/curriculum/upcoming.html

https://www.causeweb.org/cause/webinar/activity/2017-11

https://datascienceplus.com/predict-customer-churn-logistic-regression-decision-tree-and-random-forest/

https://blog.codecentric.de/en/2017/11/explore-predictive-maintenance-flexdashboard/

Flight tracking realtime Adsbexchange.com

https://www.r-bloggers.com/how-to-scrape-pdf-and-rmd-to-get-inspiration/amp/

Shinyproxy

Philipp kuhn chardonney

http://blog.revolutionanalytics.com/2017/11/doazureparallel-containers.html

Try OpenCPU on Docker

https://www.jaclynkokx.com/single-post/2017/11/17/Predicting-Which-Water-Pumps-Might-Fail?_amp_=true

http://usblogs.pwc.com/emerging-technology/machine-learning-methods-infographic/

http://blog.otoro.net/2017/10/29/visual-evolution-strategies/

The German Traffic Sign Detection Benchmark http://benchmark.ini.rub.de/?section=gtsdb&subsection=dataset

https://www.r-bloggers.com/rule-your-data-with-tidy-validation-reports-design/amp/

http://flowingdata.com/2017/01/24/one-dataset-visualized-25-ways/

https://blogs.technet.microsoft.com/machinelearning/2017/09/25/using-the-team-data-science-process-tdsp-in-azure-machine-learning

http://colinfay.me/purrr-text-wrangling/

Data engineering podcast

Moderat electronica

Ubuntupodcast.org

https://longhowlam.wordpress.com/2017/12/10/the-i-love-ikea-web-app-build-at-the-ikea-hackaton-with-r-and-shiny/

http://www.masalmon.eu/2017/12/11/goodrpackages/

https://christophm.github.io/interpretable-ml-book/

https://rviews.rstudio.com/2017/12/11/r-and-tensorflow/

https://github.com/h2oai/h2o-tutorials/tree/master/h2o-world-2017/automl

Machine learning tutorial slides https://docs.google.com/presentation/d/1kSuQyW5DTnkVaZEjGYCkfOxvzCqGEFzWBy4e9Uedd9k/mobilepresent?slide=id.g2397597de6_0_0

http://blog.revolutionanalytics.com/2017/12/r-in-the-windows-subsystem-for-linux.html

https://appsilondatascience.com/blog/rstats/2017/10/17/scaling-shiny.html

Google machine learning Tutorial https://docs.google.com/presentation/d/1kSuQyW5DTnkVaZEjGYCkfOxvzCqGEFzWBy4e9Uedd9k/mobilepresent?slide=id.g168a3288f7_0_58

Data science statistics terminology https://ubc-mds.github.io/resources_pages/terminology/

Knowledge Management, Document management:

https://medium.com/@SoftClouds/knowledge-management-vs-document-management-17700986a51a

http://tdan.com/framework-for-managing-knowledge-content-and-documents/21065

https://www.atlassian.com/blog/archives/document_manage/amp

https://www.noggle.online

https://www.researchgate.net/publication/267966411_A_STRATEGY_FOR_SEMANTIC_DOCUMENT_CLASSIFICATION_IN_AN_ONTOLOGY-DRIVEN_KNOWLEDGE_MANAGEMENT_SYSTEM/amp

http://www.nltk.org/book/ch06.html

http://zacstewart.com/2015/04/28/document-classification-with-scikit-learn.html

https://towardsdatascience.com/machine-learning-nlp-text-classification-using-scikit-learn-python-and-nltk-c52b92a7c73a

https://www.python-course.eu/text_classification_python.php

https://www.quantstart.com/articles/Supervised-Learning-for-Document-Classification-with-Scikit-Learn

https://stackoverflow.com/questions/40826144/classifying-text-documents-using-nltk

https://www.springboard.com/blog/text-mining-in-r/

http://r-posts.com/how-to-extract-data-from-a-pdf-file-with-r/

https://slides.yihui.name/xaringan/#39

https://docs.microsoft.com/en-us/azure/machine-learning/preview/scenario-document-collection-analysis

https://cran.r-project.org/web/packages/text2vec/vignettes/text-vectorization.html

https://lukeoakdenrayner.wordpress.com/2017/12/18/the-chestxray14-dataset-problems

Joel Grus Advent of Code live coding

https://www.analyticsvidhya.com/blog/2015/09/hypothesis-testing-explained/

https://github.com/Jam3/math-as-code

https://bookdown.org/baydap/bookdownplus/

https://bookdown.org/yihui/blogdown/

https://www.pyimagesearch.com/2017/12/18/keras-deep-learning-raspberry-pi/

Bdb podcast

http://style.tidyverse.org/

https://www.dominodatalab.com/resources/managing-data-science/

https://caitlinhudon.com/2017/12/22/blue-christmas

https://towardsdatascience.com/a-zero-math-introduction-to-markov-chain-monte-carlo-methods-dcba889e0c50

Data science at home podcast

http://www.questionflow.org/2017/12/26/combined-outlier-detection-with-dplyr-and-ruler/

Talk python podcast

http://usblogs.pwc.com/emerging-technology/machine-learning-evolution-infographic/

Interpretable Machine Learning https://christophm.github.io/interpretable-ml-book/

https://www.datasciencecentral.com/profiles/blogs/50-national-open-data-plateforms

http://www.wzchen.com/probability-cheatsheet/

Introduction to probability https://www.amazon.com/gp/product/1466575573/ref=as_li_tl?ie=UTF8&camp=1789&creative=390957&creativeASIN=1466575573&linkCode=as2&tag=datascientist-20&linkId=DMC64XQVG4QHMHVQ

Wickham Awesome. Good Enough Software Practices to my reading list too

Text as Data, Stanford https://t.co/w3AsfAZ7Fq?amp=1

Machine Learning: The High-Interest Credit Card of Technical Debt https://t.co/oVbVvWsB0p?amp=1

What’s your ML test score? A rubric for ML production systems https://research.google.com/pubs/pub45742.html

A Pragmatic Introduction to Signal Processing https://t.co/w7RwwD1sUk?amp=1

https://data.world/promptcloud/all-titles-by-techcrunch-and-venturebeat-in-2017

https://itsalocke.com/blog/working-with-pdfs---scraping-the-pass-budget/

https://cloud.withgoogle.com/build/infrastructure/then-now-google-history-urs-h%C3%B6lzle/

http://selbydavid.com/2017/12/29/r-android/

https://rud.is/rpubs/2017-year-in-review/

Dataquest online course

http://datasciencemasters.org/

List of definitive data guides http://rocketdatascience.org/?p=482

http://www.onceupondata.com/2017/12/31/stringr-explorer/

https://medium.com/mlreview/gradient-boosting-from-scratch-1e317ae4587d

Angelika DE92100500004164605710

https://tensorflow.rstudio.com/blog/word-embeddings-with-keras.html

https://python-graph-gallery.com/?platform=hootsuite

https://ryanpeek.github.io/2017-10-24-mapping-with-sf-part-1/

Daniel miessler podcast

http://www.business-science.io/code%20tools/2018/01/04/tibbletime-0-1-0.html

https://mapr.com/blog/connecting-apache-drill-power-bi-part-3/

https://azure.microsoft.com/en-us/free/free-account-faq

https://rbasics.netlify.com/

https://mapr.com/training/essentials/

Data crunch podcast

Field guide to the R ecosystem http://fg2re.sellorm.com/

http://blog.revolutionanalytics.com/2017/12/ml-server-ai-path.html

R on WSL!!! http://blog.revolutionanalytics.com/2017/12/r-in-the-windows-subsystem-for-linux.html

http://mirai-solutions.ch/news/2018/01/09/NYC-TLC-Trip-Data-Analysis-Using-Sparklyr-and-Google-BigQuery/

About scraping and tidy text!!! https://rayms.github.io/2018-01-04-election-observers/

https://dzone.com/articles/drill-data-with-apache-drill-part-2

https://mapr.com/blog/how-use-sql-hadoop-drill-rest-json-nosql-and-hbase-simple-rest-client/

http://www.treselle.com/blog/drill-data-with-apache-drill/

http://dataottam.com/2015/12/20/apache-drills-role-in-the-big-data-enterprise-data-architecture/

https://tensorflow.rstudio.com/blog/keras-customer-churn.html

Overview of Artificial Neural Networks and its Applications https://www.xenonstack.com/blog/overview-of-artificial-neural-networks-and-its-applications

http://www.rblog.uni-freiburg.de/2017/02/07/deep-learning-in-r/

http://selbydavid.com/2018/01/09/neural-network/

http://blog.revolutionanalytics.com/2017/08/text-categorization-deep-learning.html

https://aischool.microsoft.com/learning-paths

https://azure.microsoft.com/en-us/resources/six-cloud-challenges-solved/?wt.mc_id=AID627566_QSG_SCL_216009

http://blog.revolutionanalytics.com/2018/01/r-cloud-tools.html

https://www.linux.com/learn/intro-to-linux/2017/12/ipv6-auto-configuration-linux

http://colinfay.me/purrr-web-mining/

History of deep learning https://arxiv.org/abs/1702.07800

https://www.datasciencecentral.com/profiles/blogs/14-great-articles-about-cross-validation-model-fitting-and-select

https://appsilondatascience.com/blog/rstats/2018/01/16/keras.html

https://stackoverflow.com/questions/14837902/how-to-write-a-function-that-calls-a-function-that-calls-data-table?stw=2

Visualization Book http://socviz.co/

Machine learning engineering best practices https://t.co/TByWo4eiBn?amp=1

https://www.zevross.com/blog/2017/06/19/tips-and-tricks-for-working-with-images-and-figures-in-r-markdown-documents/

https://ropensci.org/blog/2018/01/16/tidyhydat/

Look at

R packages: seplyr, wrapr, RPostgres, RMariaDB, new dbplyr, flextable, usethis, reprex, strict, async, whisker, brew, glue, opencpu, geofacet, gganimate, tidypredict, wakefield (random data generation), janitor, rio, modelr, broom, purrr

Others: SQL memcache, Apache Solr, rappdirs

Interesting blogs

https://cartesianfaith.com/about/

https://chrisalbon.com/

http://simplystatistics.org/

http://pbpython.com/

http://stat545.com

https://blog.rstudio.org/

https://shiring.github.io

https://github.com/rushter/data-science-blogs (a lot of blogs)

https://www.reddit.com/r/datascience/

https://edwinth.github.io

https://edwinth.github.io/blogs-I-read/

https://www.mailund.dk