Triamus / play

play repo for experiments (mainly with git)

data discovery #18

Open Triamus opened 6 years ago

Triamus commented 6 years ago

ideas

docs

Data profiling report with pandas https://github.com/pandas-profiling/pandas-profiling

Automating Exploratory Data Analysis for use in Predictive Modeling (Classification) https://edoc.hu-berlin.de/bitstream/handle/18452/14945/churakova.pdf?sequence=1&isAllowed=y

http://www.jenunderwood.com/2017/02/28/automating-analytics-tellius/
http://www.jenunderwood.com/2017/08/01/tellius-smart-data-discovery/
https://www.udemy.com/automating-data-exploration-with-r/
https://www.kaggle.com/xanderhorn/automated-exploratory-data-analysis-notebook
https://ujjwalkarn.me/2016/06/17/introducing-xda-r-package-for-exploratory-data-analysis/
https://github.com/topics/exploratory-data-analysis
https://github.com/jadianes/data-science-your-way
http://daslab.seas.harvard.edu/projects/queriosity/assets/doc/queriosity_vision_paper.pdf
https://github.com/boxuancui/DataExplorer
https://www.channels.elastacloud.com/channels/autoexplorer/autoexplorer-an-automated-data-exploration-r-package
https://conferences.oreilly.com/strata/strata-eu-2017/public/schedule/detail/57647
https://github.com/MITHaystack/scikit-discovery
https://emcien.com/free-data-discovery-whitepaper/
https://coriniumintelligence.com/governed-data-discovery-best-practices-whitepaper/
https://github.com/NathanEpstein/Dora
https://github.com/chanduPydev/EDA
https://github.com/elastacloud/automatic-data-explorer
DATA VISUALIZATION IN EXPLORATORY DATA ANALYSIS: AN OVERVIEW OF METHODS AND TECHNOLOGIES https://uta-ir.tdl.org/uta-ir/bitstream/handle/10106/25475/MAO-THESIS-2015.pdf?sequence=1
Automating Exploratory Data Analysis for Efficient Data Mining http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.18.2542&rep=rep1&type=pdf
Issues in Automating Exploratory Data Analysis http://w3.sista.arizona.edu/~cohen/Publications/papers/Amant-1995.pdf
Solving the Source Data Problem With Automated Data Profiling http://static.progressivemediagroup.com/uploads/whitepaper/31/34fcaf18-a5de-44f5-9ddb-d8493dcaf056.pdf
Data Profiling with R slides https://www.slideshare.net/michellekolbe/data-profiling-with-r
AUTOMATIC DATA PROFILING WITH R http://www.dyna-bi.com/about/about/technology/automatic-data-profiling-with-r/
Data as a Service: A Framework for Providing Reusable Enterprise Data Services https://www.amazon.com/Data-Service-Framework-Providing-Enterprise/dp/1119046580/ref=sr_1_1?ie=UTF8&qid=1513616531&sr=8-1&keywords=Data+as+a+Service%3A+A+Framework+for+Providing+Reusable+Enterprise+Data+Services
Kylo: Automatic Data Profiling and Search-Based Data Discovery https://dzone.com/articles/kylo-automatic-data-profiling-and-search-based-dat
Clean Data Profiling https://www.symantec.com/content/dam/symantec/docs/security-center/white-papers/security-response-clean-data-profiling-08-en.pdf
Data Profiling with Metanome https://hpi.de/fileadmin/user_upload/fachgebiete/naumann/publications/2015/p2092-papenbrock.pdf https://hpi.de/naumann/projects/data-profiling-and-analytics/metanome-data-profiling.html
Three-Dimensional Analysis - Data Profiling Techniques https://www.amazon.com/Three-Dimensional-Analysis-Data-Profiling-Techniques/dp/0980083303/ref=sr_1_1?ie=UTF8&qid=1513617026&sr=8-1&keywords=data+profiling
Data Profiling Best Practices - Pitney Bowes Whitepaper http://www.pbinsight.co.in/files/resource-library/resource-files/Data-Profiling-Best-Practices.pdf
Data Cleaning: Problems and Current Approaches http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.98.8661&rep=rep1&type=pdf
Data Profiling in Python http://maultech.com/chrislott/blog/20141212_data_profile_scripts.html
Exploratory Data Analysis of Craft Beers: Data Profiling with Python https://www.datacamp.com/community/tutorials/python-data-profiling
TPOT: A Python tool for automating data science http://www.randalolson.com/2016/05/08/tpot-a-python-tool-for-automating-data-science/
pandas-profiling 1.4.0 https://pypi.python.org/pypi/pandas-profiling/1.4.0
Making data cleaning simple with the Sparkling.data library https://medium.com/ibm-data-science-experience/making-data-cleaning-simple-with-the-sparkling-data-library-3f865ad5f8b4
How to Automate your Data Cleanup with Python http://2016.pyconuk.org/talks/how-to-automate-your-data-cleanup-with-python/
Scalable And Incremental Data Profiling With Spark (Trifacta) https://www.slideshare.net/JenAman/scalable-and-incremental-data-profiling-with-spark
Kylo intro https://kylo.io/

powerBI

https://docs.microsoft.com/en-us/power-bi/desktop-get-the-desktop
https://dzone.com/articles/10-r-powered-visualizations-for-power-bi-dashboard
https://powerbi.microsoft.com/en-us/blog/data-cleansing-with-r-in-power-bi/
https://dzone.com/articles/r-with-powerbi-a-step-by-step-guide
https://www.blue-granite.com/tutorials/power-bi-and-r