Open caesar0301 opened 9 years ago
Added! A pull request is encouraged to record your kind contribution. :+1:
Are you interested in adding some links from Argentina's government?
Can you make a Pull Request about your data?
Of course. I'll send it tonight.
Nice source. Added under Sport. :+1:
Pandas Remote Data DataFrame API wrappers: http://pandas.pydata.org/pandas-docs/dev/remote_data.html
- Yahoo! Finance
- Google Finance
- St. Louis FED (FRED)
- Kenneth French’s data library
- World Bank
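As far as I know, these wrappers have since moved out of pandas into the separate pandas-datareader package, and the Yahoo! and Google Finance readers are no longer reliable. Here is a minimal sketch for the FRED and World Bank sources, assuming pandas-datareader is installed and the network is reachable. The helper names (`year_span`, `fetch_fred_series`, `fetch_world_bank`) are hypothetical and only for illustration; `DataReader` and `wb.download` are the package's own calls.

```python
from datetime import date


def year_span(first_year, last_year):
    """Return (start, end) dates covering first_year..last_year inclusive."""
    if first_year > last_year:
        raise ValueError("first_year must not be after last_year")
    return date(first_year, 1, 1), date(last_year, 12, 31)


def fetch_fred_series(series_id, first_year, last_year):
    """Download one FRED series as a DataFrame (needs network access)."""
    # Lazy import: assumes the third-party pandas-datareader package is installed.
    from pandas_datareader import data as web

    start, end = year_span(first_year, last_year)
    return web.DataReader(series_id, "fred", start, end)


def fetch_world_bank(indicator, countries, first_year, last_year):
    """Download one World Bank indicator for a list of country codes."""
    # Lazy import, same assumption as above.
    from pandas_datareader import wb

    return wb.download(indicator=indicator, country=countries,
                       start=first_year, end=last_year)
```

For example, `fetch_fred_series("GDP", 2010, 2014)` or `fetch_world_bank("NY.GDP.PCAP.KD", ["US", "AR"], 2005, 2008)` should each return a DataFrame; the series ID and indicator code are only examples.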
Transcripts of all debates in the German parliament (Bundestag) as txt files: http://www.bundestag.de/plenarprotokolle
U.S. Department of Education:
Hi, would you be able to add LG Inform to awesome-public-datasets? It holds publicly available data about local authorities and fire and rescue services in England: http://lginform.local.gov.uk/search Thanks, Alex
@rtbarber NCES added! LGInform added!
Great, thanks. Have a good day.
Kind Regards
Alex
Some additional biology-related public datasets worth considering:
- ExAC - http://exac.broadinstitute.org/ (exome sequencing data for 60,706 unrelated individuals, including 1000 Genomes)
- OMIM - http://www.omim.org/ (database of phenotype-genotype relationships)
- dbSNP - http://www.ncbi.nlm.nih.gov/SNP/ (database of single nucleotide polymorphisms and other short genetic variations)
- dbGaP - http://www.ncbi.nlm.nih.gov/gap (database of genotypes and phenotypes)
A French flora recognition system: http://identify.plantnet-project.org/en/
@PanArnaud Where is the public dataset on this page?
It's a search engine, so it may not be appropriate ... http://identify.plantnet-project.org/en/base/tree
That's not a dataset. You can't download it as a CSV (for example) or access it via public API.
I understand. Sorry for the inconvenience.
Obvious internet stuff: http://thecatapi.com
Belgium also has open data: http://data.gov.be/
The Macaulay Library: archive of wildlife sounds and videos http://macaulaylibrary.org/
@cofiem It seems that these data are not free?
Partially free for some datasets.
@caesar0301 Unfortunately yes, you're right, the data are neither free nor in a machine-readable form as far as I can see :disappointed:
I found this collection of datasets of (Context-Aware) Recommender Systems. http://students.depaul.edu/~yzheng8/DataSets.html
Maybe it's a good idea to talk to the author before publishing it.
I have reached out to the author for permission, and he said yes. I will merge this category into the list manually.
Thanks for the detailed list of many awesome datasets! A few good data sources are missing on the biology side:
- GTEx: http://www.gtexportal.org/
- ESP (Exome Sequencing Project): https://esp.gs.washington.edu/drupal/
- ExAC (Exome Aggregation Consortium): http://exac.broadinstitute.org/
- UK10K: http://www.uk10k.org/
I see you have the Internet Archive's ArchiveIt! service listed as a search engine, but it's really a self-serve web archiver.
Other Internet Archive datasets: https://openlibrary.org/developers/dumps -- metadata for books
Integrated Marine Observing System (IMOS) - roughly 30TB of ocean measurements: https://imos.aodn.org.au
Or directly on the S3 bucket: http://imos-data.s3-website-ap-southeast-2.amazonaws.com/
There is a nice Quora topic about it where you can find other sources as well.
Hi there! I have data on Japanese kanji usage frequency, also available as a user-friendly page. Does it satisfy the requirements?
Storage block traces (OSI licensed).
@wumpus Thanks for your suggestion. The archives may fit the PublicDomains category. The OL dump has also been added.
@danfruehauf IMOS added! :+1:
Hello,
An international economic database is being built here: http://widukind.cepremap.org/ and all the source code (including Python and R clients) is available here: https://github.com/Widukind.
Thanks!
Aviation weather data: https://aviationweather.gov/adds/dataserver
Added!
- The Personal Genome Project: http://www.personalgenomes.org/ and https://my.pgp-hms.org/public_genetic_data
- 1000 Genomes: http://www.1000genomes.org/ and http://www.1000genomes.org/data
- UCSC Public Data: http://hgdownload.soe.ucsc.edu/downloads.html