onefact / datathinking.org

Data Thinking website deployed using GitHub Pages
https://datathinking.org
Apache License 2.0

[website] create datasets page on datathinking.org/datasets #23

Open jaanli opened 1 year ago

jaanli commented 1 year ago

via @kulnor :

For anonymization:

https://centre.humdata.org/learning-path/disclosure-risk-assessment-overview/statistical-disclosure-control-tutorial/

https://sdctheory.readthedocs.io/en/latest/

jaanli commented 1 year ago

https://lacunafund.org/datasets/

jaanli commented 1 year ago

https://nexus.openlca.org/search

foodb.ca

jaanli commented 1 year ago

with @siimre -- datasets on driving / accidents by language (look at word error rate and driving accidents and availability of speech-to-text in cultures / languages)

jaanli commented 1 year ago

with @siimre -- correlate word error rate of machine translation software to scams and phishing attack success - e.g. https://microdata.worldbank.org/index.php/catalog?page=1&sk=laundering&ps=15
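
A minimal sketch of the correlation step, assuming the data can be reduced to one (word error rate, phishing success rate) pair per language. The `pearson` helper and the pair layout are hypothetical, and no real figures are included here; the actual numbers would have to come from the datasets themselves.

```typescript
// Hypothetical helper: Pearson correlation over (x, y) pairs,
// e.g. one [wordErrorRate, phishingSuccessRate] pair per language.
function pearson(pairs: Array<[number, number]>): number {
  const n = pairs.length;
  if (n < 2) {
    throw new Error("need at least two pairs");
  }

  // Means of each series.
  let sumX = 0;
  let sumY = 0;
  for (const [x, y] of pairs) {
    sumX += x;
    sumY += y;
  }
  const meanX = sumX / n;
  const meanY = sumY / n;

  // Accumulate the covariance and variance terms (unnormalized).
  let cov = 0;
  let varX = 0;
  let varY = 0;
  for (const [x, y] of pairs) {
    const dx = x - meanX;
    const dy = y - meanY;
    cov += dx * dy;
    varX += dx * dx;
    varY += dy * dy;
  }

  if (varX === 0 || varY === 0) {
    throw new Error("correlation is undefined for a constant series");
  }
  return cov / Math.sqrt(varX * varY);
}
```

A value near 1 or -1 would only indicate a linear association across languages, not that translation quality causes phishing success.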

jaanli commented 1 year ago

https://urban.tech.cornell.edu/what-were-doing/

jaanli commented 1 year ago

https://twitter.com/steinly0/status/1629512550323740675?s=20

jaanli commented 1 year ago

Maybe use React Flow for a diagram of these; each node just links to a # heading on the same page.

ChatGPT can help.

Make each page become a separate slide with slides.js
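
A rough sketch of the diagram idea, assuming each dataset section has a heading and the diagram needs one node per heading. The `HeadingNode` interface only mirrors the `{ id, position, data }` shape that React Flow nodes use; nothing below imports React Flow. `slugify`, `headingsToNodes`, and the sample headings are hypothetical, and the slug rule would need to match whatever anchors the site actually generates.

```typescript
// Hypothetical node shape, modeled on React Flow's { id, position, data }.
interface HeadingNode {
  id: string;
  position: { x: number; y: number };
  data: { label: string; href: string };
}

// Lowercase, collapse runs of non-alphanumerics into "-", trim stray "-".
// This slug rule is an assumption, not the site's actual anchor scheme.
function slugify(heading: string): string {
  return heading
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Lay the nodes out in a single column, `spacing` pixels apart.
function headingsToNodes(headings: string[], spacing: number = 80): HeadingNode[] {
  return headings.map((heading, index) => {
    const slug = slugify(heading);
    return {
      id: slug,
      position: { x: 0, y: index * spacing },
      data: { label: heading, href: `#${slug}` },
    };
  });
}

// Placeholder headings, not the real sections of the datasets page.
const nodes = headingsToNodes(["Anonymization", "Food & Nutrition"]);
```

Each node's `data.href` could then be rendered as an ordinary same-page link inside a custom node component.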

jaanli commented 1 year ago

https://www.linkedin.com/posts/mengyaowang11_datascience-dataanalysts-machinelearning-activity-7035470197410975744-IXil

jaanli commented 1 year ago

indrek data &

https://www.fuse-capital.com/ 2023 M&A Report 2023 and beyond

7. Are you planning to go through an M&A process in the next 12 months?
   Options: Yes, No, Maybe
8. What are your main drivers for conducting an acquisition? (Choose as many as you like)
   Options: EBIT(DA) Growth, Acquiring key customer(s), Improving your talent pool, Improving your IP, Not applicable
9. How do you plan to fund the deal? (Choose as many as you like)
   Options: Equity, Private/Venture Debt fund, Existing investors, Bank Debt
10. What are the major obstacles to completion? (Choose as many as you like)
    Options: Access to capital, Internal resourcing / team bandwidth, Differences in valuation expectations, Time constraints
11. How confident are you in: Completing the deal; Ability of your advisor
    Scale: Very confident, Confident, Somewhat confident, Somewhat not confident, Extremely not confident
12. How confident are you in: The equity markets; The debt markets
    Scale: Very confident, Confident, Somewhat confident, Somewhat not confident, Extremely not confident
13. What is your key prediction for the M&A market in 2023?

jaanli commented 1 year ago

https://www.palladiummag.com/2023/02/23/the-west-lives-on-in-the-talibans-afghanistan/

And Wang Huning

And J&J.

And OSINT guide to mental health.

And ZeroGPT

jaanli commented 1 year ago

https://climateatlas.ca/map/canada/plus30_2030_85#lat=48.95&lng=-79.67

jaanli commented 1 year ago

https://foodb.ca/downloads

jaanli commented 1 year ago

Land use from Columbia, NYU, and the church in New York City - maybe from https://capitalplanning.nyc.gov/map/facilities#11.72/40.6506/-73.9607 or https://github.com/NYCPlanning/db-developments

jaanli commented 1 year ago

image https://nepis.epa.gov/Exe/ZyNET.exe/P100JPPH.txt?ZyActionD=ZyDocument&Client=EPA&Index=2011%20Thru%202015&Docs=&Query=&Time=&EndTime=&SearchMethod=1&TocRestrict=n&Toc=&TocEntry=&QField=&QFieldYear=&QFieldMonth=&QFieldDay=&UseQField=&IntQFieldOp=0&ExtQFieldOp=0&XmlQuery=&File=D%3A%5CZYFILES%5CINDEX%20DATA%5C11THRU15%5CTXT%5C00000011%5CP100JPPH.txt&User=ANONYMOUS&Password=anonymous&SortMethod=h%7C-&MaximumDocuments=1&FuzzyDegree=0&ImageQuality=r75g8/r75g8/x150y150g16/i425&Display=hpfr&DefSeekPage=x&SearchBack=ZyActionL&Back=ZyActionS&BackDesc=Results%20page&MaximumPages=1&ZyEntry=2

https://tnmt.com/infographics/carbon-emissions-by-transport-type/ image

via will!

jaanli commented 1 year ago

Make map in react flow for https://www.linkedin.com/posts/mengyaowang11_dataanalysis-dataengineering-datascience-activity-7039573037649731584-Sr7R

jaanli commented 1 year ago

https://tech.marksblogg.com/duckdb-geospatial-gis.html

https://www.architecture-performance.fr/ap_blog/trying-duckdb-with-discogs-data/

https://ibis-project.org/backends/DuckDB/#the-flexibility-of-python-analytics-with-the-scale-and-performance-of-modern-sql

https://news.ycombinator.com/item?id=29010103

https://pacha.dev/blog/2021/08/27/comparing-sqlite-duckdb-and-arrow-with-un-trade-data/

jaanli commented 1 year ago

https://www.together.xyz/blog/openchatkit

jaanli commented 1 year ago

nutritionx claims

jaanli commented 1 year ago

hotel costs to replicate https://twitter.com/levelsio/status/1634617450041057280 worldwide (with interest + yields estimated from high-yield accounts)

jaanli commented 1 year ago

https://www.reddit.com/r/datasets/comments/w9uypl/financial_datasets_for_long_term_analysis_and/

jaanli commented 1 year ago

https://twitter.com/mattdeitke/status/1638608472525897728

jaanli commented 1 year ago

ArcGIS race and ethnicity in the US

salt per country by @mwagner

jaanli commented 1 year ago

Brick concrete rain exterior -- data from India supply chain, and time spent on maintenance, or wood homes. Brick and concrete

jaanli commented 1 year ago

for iris: datasets related to technological advancement? job loss, economics, etc