Disfactory / SpotDiff

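The SpotDiff project lets volunteers compare satellite images taken before and after 2016-05-20 to determine whether the buildings at suspected factory locations in the Council of Agriculture's 50,000 records are newly built, so that reporting efforts can be focused, or so that every suspected factory site in Taiwan can be scanned.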
MIT License

Task 2: fetch data from the Factory table (in the Disfactory database) to the Location table #10

Open yalgorithm777 opened 2 years ago

yalgorithm777 commented 2 years ago

In this task, you need to write a Python script that fetches data from the Factory table (in the Disfactory database) and loads it into the Location table. The script must use the operation functions developed in Task 1 to put data into the Location table.

The description of the Factory table is in the "/factories/{factory_id}" section in the Disfactory API page.

IMPORTANT: make sure that you read the Coding Standards section before writing code.

IMPORTANT: open separate branches and request code reviews to merge into the main branch when the subtasks are done.

Notes:

Please reply to this issue if there are questions.

Sourbiebie commented 2 years ago

Yesterday we decided to use CSV for the data import. I'll write a script to load data from the CSV into the Location table.

Reasons:

  1. The current Disfactory API returns at most 100 randomly chosen locations per call, so it is not designed for data export.
  2. SpotDiff only needs a one-time initial import from the government data.
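A minimal sketch of what such a one-time CSV import could look like. The `factory_id` column name and the `create_location` callable are assumptions made for illustration; `create_location` stands in for whichever Task 1 operation function actually inserts a row into the Location table, and the real CSV layout may differ.

```python
import csv
import io


def import_locations(csv_file, create_location):
    """Read factory rows from a CSV file object and pass each new ID to
    `create_location` (hypothetical stand-in for a Task 1 operation function).

    Returns the number of rows handed to `create_location`.
    """
    seen = set()
    imported = 0
    for row in csv.DictReader(csv_file):
        # "factory_id" is an assumed column name, not taken from the real export.
        factory_id = row["factory_id"].strip()
        if not factory_id or factory_id in seen:
            continue  # skip blank or duplicate IDs so the import stays 1-to-1
        seen.add(factory_id)
        create_location(factory_id)
        imported += 1
    return imported


if __name__ == "__main__":
    sample = io.StringIO("factory_id\nabc\nabc\ndef\n")
    stored = []
    count = import_locations(sample, stored.append)
    print(count, stored)
```

Passing the insert function in as a parameter keeps the CSV parsing testable without a database; in the real script the callable would be replaced by the Task 1 function.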

Sourbiebie commented 2 years ago

Last week we decided to keep the table 1-to-1 with the disfactory/factory table and to move the year and url fields to the Answer table. So, should we change the Notes in the description?

@yalgorithm777 Thanks for your advice!

Sourbiebie commented 2 years ago

Done, waiting for review.

Sourbiebie commented 2 years ago

https://github.com/Disfactory/SpotDiff/commit/8204c8aff62fbc816521a81511f5fdda6455323f

deeper747 commented 2 years ago

A new request here 🙏🏽 I noticed there is a way to filter for the locations that are more likely to be newly built spots. I anti-joined the factory data crawled in 2019 with the data crawled in 2022 and obtained a list of 9,982 spots (spots.csv). We'll load it into the disfactory.tw database first (to assign factory IDs and display numbers).
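For readers unfamiliar with the term, an anti-join keeps the rows of one table that have no match in another. The sketch below illustrates the idea in plain Python. It assumes the 2022 crawl is the left side (so the result is spots present in 2022 but absent in 2019) and uses a hypothetical `location` column as the matching key; the comment above does not say which key or direction was actually used.

```python
def anti_join(left_rows, right_rows, key):
    """Return the rows of `left_rows` whose `key` value never appears in `right_rows`."""
    right_keys = {row[key] for row in right_rows}
    return [row for row in left_rows if row[key] not in right_keys]


if __name__ == "__main__":
    # Toy data; "location" is an assumed key column, not the real schema.
    crawl_2019 = [{"location": "A"}, {"location": "B"}]
    crawl_2022 = [{"location": "B"}, {"location": "C"}]
    print(anti_join(crawl_2022, crawl_2019, "location"))  # [{'location': 'C'}]
```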

Also, I'll split the table into files of 1,000 rows each, hoping we can finish one file every two months.
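One possible way to do the split, sketched with the standard library only. The function name, the `spots_NN.csv` output naming, and the choice to repeat the header row in every file are assumptions for illustration, not the procedure actually used.

```python
import csv
import os
import tempfile


def split_csv(src_path, out_dir, rows_per_file=1000):
    """Split a CSV into numbered files of at most `rows_per_file` data rows.

    The header row is copied into every output file.
    Returns the list of output paths in order.
    """
    paths = []
    out = None
    writer = None
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader)
        for i, row in enumerate(reader):
            if i % rows_per_file == 0:
                # Start a new output file every `rows_per_file` data rows.
                if out is not None:
                    out.close()
                name = f"spots_{i // rows_per_file + 1:02d}.csv"  # hypothetical naming
                path = os.path.join(out_dir, name)
                out = open(path, "w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)
                paths.append(path)
            writer.writerow(row)
    if out is not None:
        out.close()
    return paths
```

With 9,982 rows and 1,000 rows per file, this would produce 10 files, the last one holding 982 rows.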

deeper747 commented 2 years ago

The production table is now available in the "production" spreadsheet: https://docs.google.com/spreadsheets/d/10PUagSg0rgy4ycLpJKvQ24fyPk_YP0hgVYVijdlmZX0/edit?usp=sharing