creativecommons / quantifying

quantify the size and diversity of the commons--the collection of works that are openly licensed or in the public domain
MIT License

Flickr api #20

Closed SusannYY closed 1 year ago

SusannYY commented 1 year ago

Fixes

  1. Optimized data pull steps in photo_detail.py
  2. Added data_clean.py, which drops empty columns, duplicate rows, and unneeded data columns. Only location, dates, license, tags, views, and comments are kept in the cleaned data (example: cleaned_hs.csv)
  3. Added datasets for three licenses in dataset file
  4. Added data_analysis.py, which includes some basic analysis: cleaning the tags column and generating a word cloud from the photo tags
  5. Changed the comment format
  6. Added yield in photo_detail.py
  7. Deleted pyautogui package
  8. Merged main
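
The cleaning and tag-counting steps in items 2 and 4 can be sketched roughly as follows. This is a minimal standard-library illustration, not the code in data_clean.py or data_analysis.py: the function names `clean_rows` and `tag_counts`, the exact column headers, and the assumption that tags are whitespace-separated within one cell are all hypothetical.

```python
from collections import Counter

# Column names taken from the PR description; the real CSV headers may differ.
KEEP = ["location", "dates", "license", "tags", "views", "comments"]


def clean_rows(rows, keep=KEEP):
    """Keep selected columns, then drop all-empty columns and duplicate rows."""
    # Restrict every row to the columns of interest, normalising values to text.
    trimmed = [
        {k: "" if row.get(k) is None else str(row[k]).strip() for k in keep}
        for row in rows
    ]
    # A column counts as empty when no row holds a value for it.
    non_empty = [k for k in keep if any(r[k] for r in trimmed)]
    seen, cleaned = set(), []
    for r in trimmed:
        key = tuple(r[k] for k in non_empty)
        if key not in seen:  # keep only the first of any identical rows
            seen.add(key)
            cleaned.append({k: r[k] for k in non_empty})
    return cleaned


def tag_counts(rows, column="tags"):
    """Count tags, assuming they are whitespace-separated in a single cell."""
    counts = Counter()
    for r in rows:
        counts.update(r.get(column, "").lower().split())
    return counts


# Made-up sample rows, for illustration only.
sample = [
    {"location": "Paris", "tags": "Sunset city", "views": "10", "id": "1"},
    {"location": "Paris", "tags": "Sunset city", "views": "10", "id": "2"},
    {"location": "", "tags": "city night", "views": "3", "id": "3"},
]
cleaned = clean_rows(sample)
counts = tag_counts(cleaned)
```

Note that duplicates are detected after the unneeded columns are removed, so two rows that differ only in a dropped column (here `id`) collapse into one. The resulting counts could then feed a word cloud.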

Description

This PR includes the Flickr API call script, along with scripts for data cleaning and some basic data analysis. Further analysis is still needed and will be finished in the coming weeks. The current dataset is incomplete and does not yet include licenses 4, 5, 6, 9, and 10; these will be added soon, since the script is final and can be run from my end.
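
As a rough illustration of how a `yield`-based pull can cover the remaining licenses one page at a time, here is a minimal sketch. It is not the code in photo_detail.py: `iter_photos`, `fetch_page`, and `max_pages` are hypothetical names, and `fetch_page` stands in for whatever function actually calls the Flickr API.

```python
def iter_photos(license_ids, fetch_page, max_pages=100):
    """Lazily yield photo records for each license, one page at a time.

    fetch_page(license_id, page) is a hypothetical callable that returns a
    list of photo dicts, or an empty list once the pages run out.
    """
    for license_id in license_ids:
        for page in range(1, max_pages + 1):
            photos = fetch_page(license_id, page)
            if not photos:
                break  # no more results for this license
            for photo in photos:
                yield photo


# Stand-in data so the sketch runs without network access.
FAKE_PAGES = {
    (4, 1): [{"id": "a", "license": 4}, {"id": "b", "license": 4}],
    (4, 2): [{"id": "c", "license": 4}],
    (5, 1): [{"id": "d", "license": 5}],
}


def fake_fetch_page(license_id, page):
    return FAKE_PAGES.get((license_id, page), [])


photo_ids = [p["id"] for p in iter_photos([4, 5, 6], fake_fetch_page)]
```

Because the generator yields records as they arrive, a caller can write each one to disk immediately instead of holding a whole license's results in memory.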

Checklist

- [x] My pull request has a descriptive title (not a vague title like `Update index.md`).
- [x] My pull request targets the *default* branch of the repository (`main` or `master`).
- [x] My commit messages follow [best practices][best_practices].
- [x] My code follows the established code style of the repository.
- [ ] I added or updated tests for the changes I made (if applicable).
- [ ] I added or updated documentation (if applicable).
- [x] I tried running the project locally and verified that there are no visible errors.

[best_practices]:https://gist.github.com/robertpainsi/b632364184e70900af4ab688decf6f53

## Developer Certificate of Origin

For the purposes of this DCO, "license" is equivalent to "license or public domain dedication," and "open source license" is equivalent to "open content license or public domain dedication."

Developer Certificate of Origin

```
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.
```