Disfactory / SpotDiff

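The SpotDiff project aims to let netizens compare satellite imagery from before and after 2016.5.20 to determine whether the buildings at the suspected factory locations in the Council of Agriculture's 50,000 records are newly added. This lets reporting efforts be focused, or lets all suspected factory sites in Taiwan be swept through once.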
MIT License

[Data Model] Decide whether to clone or reference factory/location data from disfactory project #8

Closed Sourbiebie closed 2 years ago

Sourbiebie commented 3 years ago

Is it possible to get the table header of the factory/location data from the disfactory project so we can proceed with the discussion? (And, if possible, some sample data for reference.)

https://docs.google.com/presentation/d/1hyak0PdXA3CpxSC82T7ahmp0AbqjEJS2c3AcnhURpnw/edit#slide=id.gf5e3502c4a_2_0

[LittleWhiteYA]

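  1. I took a quick look. If city_name, town_name, lat and lng all come from factory, why not simply remove them and, when querying location, join the factory table and filter with a WHERE clause? That could avoid future inconsistencies between the location and factory data.

  2. I'd like to ask why you want to split these into different databases. The relationship between location and factory looks fairly strong, and my suggestion above assumes both live in the same database.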
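
A minimal sketch of the join-instead-of-copy idea above, using an in-memory SQLite database. The table layout is an assumption for illustration only: `location_id` and the sample rows are made up, and the real Disfactory schema may differ. It only works if both tables live in the same database, as the comment notes.

```python
import sqlite3

# Hypothetical schema: only city_name, town_name, lat, lng and factory_id
# come from the discussion; everything else is assumed for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE factory (
    factory_id TEXT PRIMARY KEY,
    city_name  TEXT,
    town_name  TEXT,
    lat        REAL,
    lng        REAL
);
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    factory_id  TEXT NOT NULL REFERENCES factory(factory_id)
);
""")
# Made-up sample rows.
conn.execute("INSERT INTO factory VALUES ('f-1', 'CityA', 'TownX', 24.0, 120.5)")
conn.execute("INSERT INTO factory VALUES ('f-2', 'CityB', 'TownY', 25.0, 121.5)")
conn.execute("INSERT INTO location (factory_id) VALUES ('f-1')")
conn.execute("INSERT INTO location (factory_id) VALUES ('f-2')")


def locations_in_town(conn, city_name, town_name):
    """Return (location_id, lat, lng) rows, reading place data from factory."""
    return conn.execute(
        """
        SELECT l.location_id, f.lat, f.lng
        FROM location AS l
        JOIN factory AS f ON f.factory_id = l.factory_id
        WHERE f.city_name = ? AND f.town_name = ?
        ORDER BY l.location_id
        """,
        (city_name, town_name),
    ).fetchall()


rows = locations_in_town(conn, "CityA", "TownX")
```

Because location stores only `factory_id`, a corrected coordinate in factory shows up in every location query without a second write.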

[YAlgorithm] Not decided yet. I'm still using a basic MVC architecture.

[ael] Can we first confirm whether the DB really needs to be separate? Only then can we talk about how to write the script that dumps the data.

[酸酸的] I'd like to first understand how disfactory's DB tables are designed, and then discuss whether they really need to be separate.

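[swind] You can ask @deeper to export the factory table as a CSV for you. That should give you the column names as well as the information for all factories. It won't include other information, though, such as official document records.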

Sourbiebie commented 3 years ago

Got CSV and API document from deeper.
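
For anyone who wants to inspect the table header from such an export, here is a small sketch using Python's `csv` module. The sample content and column names are made up for illustration; the real export's columns may differ.

```python
import csv
import io

# Made-up stand-in for the exported CSV; the real columns may differ.
sample_csv = io.StringIO(
    "factory_id,lat,lng,landcode,created_time\n"
    "f-1,24.0,120.5,A-001,2020-01-01\n"
    "f-2,24.1,120.6,A-001,2020-06-01\n"
)


def read_header_and_rows(fileobj, limit=5):
    """Return the column names and up to `limit` rows as dicts."""
    reader = csv.DictReader(fileobj)
    rows = []
    for index, row in enumerate(reader):
        if index >= limit:
            break
        rows.append(row)
    # fieldnames is None for a completely empty file.
    return list(reader.fieldnames or []), rows


header, rows = read_header_and_rows(sample_csv)
```

With a real file, pass `open(path, newline="", encoding="utf-8")` instead of the `StringIO` object; the encoding is an assumption about the export.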

aelcenganda commented 3 years ago

The script only needs to scrape factory_id into the location table.

See the Open API here https://api.disfactory.tw/swagger/

Sourbiebie commented 3 years ago

According to deeper, in the disfactory/factory table a record identified by factory_id represents a "snapshot" of a factory at a point in time. That is, the same factory (with the same landcode) may have several records, each with its own factory_id, because the lat/lon reported at different times differs (e.g. after an expansion). Therefore, I'm wondering whether the relationship between spotdiff/location and disfactory/factory is really 1-1? @aelcenganda

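(QA in Slack) [disfactory/factory] Could the items represented by each factory_id be the same land number ("地號") at different times with different lat/lon? For example, could a factory on land number A, because of an expansion, have two records in the factory DB (each with a different factory_id) that record different lat/lon? (So is factory_id the primary key of the factory table?) Also, does created_time represent the time the record was created? Thanks~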

deeper 2:30 PM: First "?": yes. Second "?": yes. Third "?": yes. 2:30 PM: Ah, so it's all of the above XD
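
To illustrate the answer above, a short sketch that groups factory records by landcode and shows one landcode mapping to several factory_id values. The records and landcode strings are invented sample data, not real Disfactory rows.

```python
from collections import defaultdict

# Invented sample records; the field names follow the discussion, the values do not.
records = [
    {"factory_id": "f-1", "landcode": "A-001", "lat": 24.0, "lng": 120.5},
    {"factory_id": "f-2", "landcode": "A-001", "lat": 24.0005, "lng": 120.5003},
    {"factory_id": "f-3", "landcode": "B-002", "lat": 25.0, "lng": 121.5},
]


def factory_ids_by_landcode(records):
    """Map each landcode to the list of factory_id values recorded on it."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["landcode"]].append(record["factory_id"])
    return dict(grouped)


grouped = factory_ids_by_landcode(records)
# Landcodes with more than one snapshot are the cases that rule out a 1-1 mapping.
multi_snapshot = {code: ids for code, ids in grouped.items() if len(ids) > 1}
```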

aelcenganda commented 3 years ago
  1. The latest data model in the Google Slides is n:n between SpotDiff/location and disfactory/factory: https://docs.google.com/presentation/d/1hyak0PdXA3CpxSC82T7ahmp0AbqjEJS2c3AcnhURpnw/edit#slide=id.gf5e3502c4a_2_0

To answer:

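Could the items represented by each factory_id be the same land number ("地號") at different times with different lat/lon? For example, could a factory on land number A, because of an expansion, have two records in the factory DB (each with a different factory_id) that record different lat/lon? (So is factory_id the primary key of the factory table?) Also, does created_time represent the time the record was created? Thanks~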

  1. Yes, factory_id is the primary key of the factory table.
  2. A factory_id represents one factory, with the lat/lon taken either from user reports or from the landcode (地段和地號, land section and land number) given by the government. We take the center of the 地號 (land number) from the government's website as the lat/lon at which to display the factory pin for that record. We also convert the coordinates into townname and landcode (地段和地號), a string, so that we can write the landcode on the official reports sent to the local government.

Government system: uses landcode as the identifier.

Disfactory system: uses coordinates (WGS84) as the identifier for visual display, and factory_id as the primary key in the factory table.
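
One common way to express an n:n relationship like the one between SpotDiff/location and disfactory/factory is a junction table. The sketch below is only an assumption about how that could look in SQLite: the `location_factory` table, `location_id`, and the sample rows are hypothetical and not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
# Hypothetical tables: only factory_id is named in the discussion.
conn.executescript("""
CREATE TABLE factory (factory_id TEXT PRIMARY KEY);
CREATE TABLE location (location_id INTEGER PRIMARY KEY);
CREATE TABLE location_factory (
    location_id INTEGER NOT NULL REFERENCES location(location_id),
    factory_id  TEXT    NOT NULL REFERENCES factory(factory_id),
    PRIMARY KEY (location_id, factory_id)
);
""")
# Made-up sample data: location 1 covers two factory snapshots,
# and snapshot f-2 is linked to two locations.
conn.executemany("INSERT INTO factory VALUES (?)", [("f-1",), ("f-2",), ("f-3",)])
conn.executemany("INSERT INTO location VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO location_factory VALUES (?, ?)",
    [(1, "f-1"), (1, "f-2"), (2, "f-2"), (2, "f-3")],
)


def factories_for_location(conn, location_id):
    """Return the factory_id values linked to one location."""
    rows = conn.execute(
        "SELECT factory_id FROM location_factory "
        "WHERE location_id = ? ORDER BY factory_id",
        (location_id,),
    ).fetchall()
    return [row[0] for row in rows]


def locations_for_factory(conn, factory_id):
    """Return the location_id values linked to one factory record."""
    rows = conn.execute(
        "SELECT location_id FROM location_factory "
        "WHERE factory_id = ? ORDER BY location_id",
        (factory_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

The composite primary key keeps the same location/factory pair from being linked twice, while still letting either side appear in many pairs.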