Code-for-OKINAWA / covid19

Okinawa Prefecture COVID-19 countermeasures site / Tokyo COVID-19 Task Force website
https://okinawa.stopcovid19.jp/
MIT License

Lay the groundwork for fetching data automatically from CSV (discuss with ISCO where the CSV will be hosted, and so on) #177

Closed: wanwanland closed this issue 4 years ago

amouro commented 4 years ago

I'll take this.

I'm going to modify the tool first, so that you can easily convert the CSV into /data/data.json.

amouro commented 4 years ago

Please bear with me while I discuss this in English. After working on the convert tool, I have a couple of things for you to decide so that data.json is formatted correctly. Feel free to translate my questions or discuss them in Japanese.

What is the updated date we want to use?

I know you change lastUpdate whenever the data changes. In the tool it defaults to 8:00 am of the next day. Do you want to use the time at which data.json is created instead? https://github.com/Code-for-OKINAWA/covid19/blob/9541c28a98d708b6bce13026ff0b8ff1c56b9bb3/data/data.json#L2

What is the raw data file format? Will someone convert it to xlsx?

The tool reads Xlsx so that the data can be selected by specifying a cell range in Excel, e.g. A2:J100. Would you mind converting the CSV to Excel before running the convert script?

Does anyone have the excel file format that Tokyo uses?

Would you like to show the numbers for 軽症・中等症 (mild/moderate) and 重症 (severe)?

The website currently displays "-" for these. Should it show the numbers from data.json?

Will you implement #92 to include 検査実施人数 (number of people tested) soon?

The tool already has a processor for that, too.

amouro commented 4 years ago

Sorry for having so many questions.

I do have a tool that processes the file correctly. You can try it in my branch: https://github.com/amouro/covid19/tree/feature/%23177-csv-to-data

The following steps show how it works.

Deprecated: this procedure has changed.

### Step 1: Convert the CSV to Excel
Open the CSV in Excel and save it as `/tool/download/沖縄県患者発生発表数-RAW.xlsx`

### Step 2: Add a data update time to the file
This is the 更新時間 (update time) of the latest ISCO file.
It currently goes in cell Q2, and the date uses a special format so that Excel does not rewrite it:
`#2020/04/10 18:00#`
![image](https://user-images.githubusercontent.com/3444618/78997613-de5f3300-7b81-11ea-8df1-790ab64e0ea7.png)
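
For reference, here is a minimal Python sketch of how a value in this `#...#` format could be read back. The real convert tool is a PHP script, so this is not its implementation; the function name `parse_update_time` is hypothetical, and treating the result as a naive local time is an assumption.

```python
from datetime import datetime


def parse_update_time(cell):
    """Parse a cell such as '#2020/04/10 18:00#' into a naive datetime.

    The surrounding '#' characters are the markers that stop Excel from
    reinterpreting the value as a date; they are stripped before parsing.
    """
    text = cell.strip()
    if len(text) < 2 or not (text.startswith("#") and text.endswith("#")):
        raise ValueError(f"expected '#YYYY/MM/DD HH:MM#', got {cell!r}")
    return datetime.strptime(text[1:-1], "%Y/%m/%d %H:%M")
```

A value without the `#` markers raises `ValueError`, which makes it obvious when Excel has rewritten the cell.
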

### Step 3: Add a file called 検査実施サマリ.xlsx
This file holds the 検査実施人数 (number of people tested) data, and I assume you are going to have more data from issue #92.
Please add this file.
[検査実施サマリ.xlsx](https://github.com/Code-for-OKINAWA/covid19/files/4461938/default.xlsx)

### Step 4: Set up the tool (/tool/README.md)
`$ cd tool/`
`$ composer install`

### Step 5: Run the script
`$ php convert.php`
The file `/data/data.json` will be replaced.
amouro commented 4 years ago

The branch with the modified convert tool:

https://github.com/amouro/covid19/tree/feature/%23177-csv-to-data

A sample of the data.json created by the converter

{
    "patients": {
        "date": "2020\/04\/10 19:00",
        "data": [
            {
                "確定日": "2020-02-14T08:00:00.000Z",
                "居住地": "南部保健所管内",
                "年代": "60代",
                "性別": "女性",
                "退院": "入院勧告解除",
                "備考": null,
                "date": "2020-02-14"
            },
            {
                "...": "...."
            },
            {
                "確定日": "2020-04-10T08:00:00.000Z",
                "居住地": "那覇市",
                "年代": "40代",
                "性別": "男性",
                "退院": "入院",
                "備考": null,
                "date": "2020-04-10"
            },
            {
                "確定日": "2020-04-10T08:00:00.000Z",
                "居住地": "那覇市",
                "年代": "50代",
                "性別": "男性",
                "退院": "入院",
                "備考": null,
                "date": "2020-04-10"
            }
        ]
    },
    "patients_summary": {
        "date": "2020\/04\/10 19:00",
        "data": [
            {
                "日付": "2020-02-14T08:00:00.000Z",
                "小計": 1
            },
            {
                "日付": "2020-02-15T08:00:00.000Z",
                "小計": 0
            },
            {
                "...": "...."
            },
            {
                "日付": "2020-04-10T08:00:00.000Z",
                "小計": 7
            }
        ]
    },
    "lastUpdate": "2020\/04\/10 22:59",
    "main_summary": {
        "attr": "検査実施人数",
        "value": 333,
        "children": [
            {
                "attr": "陽性患者数(県外感染者含む)",
                "value": 50,
                "children": [
                    {
                        "attr": "入院中(調整中含む)",
                        "value": 46,
                        "children": [
                            {
                                "attr": "軽症・中等症",
                                "value": 49
                            },
                            {
                                "attr": "重症",
                                "value": 0
                            }
                        ]
                    },
                    {
                        "attr": "退院",
                        "value": 4
                    },
                    {
                        "attr": "死亡",
                        "value": 0
                    }
                ]
            }
        ]
    }
}
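
The `patients_summary` section in the sample above has one entry per calendar day, including days with zero cases. A rough Python sketch of that aggregation follows. It is only an illustration and not the converter's PHP code: `build_daily_summary` is a hypothetical name, and the fixed `T08:00:00.000Z` suffix is copied from the sample rather than derived from any rule.

```python
from collections import Counter
from datetime import date, timedelta


def build_daily_summary(patients):
    """Count patients per day and fill the gaps between first and last day with 0.

    Each patient dict is expected to carry a "date" key in YYYY-MM-DD form,
    as in the sample data.json.
    """
    counts = Counter(p["date"] for p in patients)
    if not counts:
        return []
    # ISO dates sort correctly as plain strings, so min/max work directly.
    day = date.fromisoformat(min(counts))
    last = date.fromisoformat(max(counts))
    summary = []
    while day <= last:
        key = day.isoformat()
        summary.append({"日付": f"{key}T08:00:00.000Z", "小計": counts.get(key, 0)})
        day += timedelta(days=1)
    return summary
```

Filling the gaps matters because the chart on the site reads one entry per day; a missing day would otherwise shift every later entry.
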
amouro commented 4 years ago

Make the convert tool accept CSV files directly.

amouro commented 4 years ago

Please review pull request #248.

Feature and Changes

HOW-TO

  1. Update the files tool/downloads/cases.csv and tool/downloads/summary.csv
     - cases.csv: the CSV file provided by ISCO.
     - summary.csv: the last data update time and 検査実施人数 (number of people tested).
  2. Commit to development
  3. DONE
     - A GitHub Action will run Data Builder and generate data.json.
     - The GitHub Action will commit the data.json.

PS

amouro commented 4 years ago

@mami-miyagi I'm not sure whether you want to close this issue now, before we can get the CSV file from the ISCO server.

amouro commented 4 years ago

Prep for automated data update

URL for fetching the data automatically from CSV

https://isc-okinawa.org/opendata/470007_okinawa_covid19_patients.csv

Schedule

https://help.github.com/en/actions/reference/events-that-trigger-workflows#scheduled-events-schedule

File download

https://github.com/marketplace/actions/download-file-to-workspace
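
To make the download step concrete, here is a small Python sketch using only the standard library. It is an illustration, not the marketplace action linked above: `download_csv` is a hypothetical helper, the 30-second timeout is an arbitrary choice, and writing to a temporary `.part` file first is my own precaution so that a failed download does not leave a half-written CSV behind.

```python
import shutil
import urllib.request
from pathlib import Path

# URL taken from this thread; the destination path is passed in by the caller.
CSV_URL = "https://isc-okinawa.org/opendata/470007_okinawa_covid19_patients.csv"


def download_csv(url, dest, timeout=30.0):
    """Download url to dest, replacing dest only after the download succeeds."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url, timeout=timeout) as resp, tmp.open("wb") as out:
        shutil.copyfileobj(resp, out)
    tmp.replace(dest)  # swap the finished file into place
    return dest
```

Calling `download_csv(CSV_URL, "tool/downloads/cases.csv")` would refresh the file the converter reads, assuming the server is reachable from the runner.
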

amouro commented 4 years ago

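@m3n3m I added "入院調整中" (hospitalization being arranged) to the discharge status (confirmed in the prefectural office's published materials). Do rows with no data in 患者_退院済フラグ mean 入院調整中? Please confirm.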

m3n3m commented 4 years ago

@amouro Rows with no data in 患者_退院済フラグ are "入院調整中".

amouro commented 4 years ago

@m3n3m I've submitted PR #283 to match this rule.

amouro commented 4 years ago

In PR #283 I also made the GitHub Action download the CSV.

HOW-TO update (after the PR is merged)

edited on 4/15

When?

When Ayane Ish notifies us that ISCO has updated the CSV.

Steps

WHAT does the converter do?

  1. Download the csv
  2. Convert to JSON
  3. Commit the updated cases.csv and data.json
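
As a rough illustration of step 2, the Python sketch below turns CSV text into the `patients` section of data.json. It is not the converter's actual code (that is PHP). The column names `公表_年月日`, `患者_居住地`, `患者_年代` and `患者_性別` are assumptions, since this thread only mentions `患者_退院済フラグ` and `患者_状態`; the date column is assumed to be in YYYY-MM-DD form already, and the function names are hypothetical.

```python
import csv
import io
import json

# Assumed header names; adjust them to the real ISCO CSV.
COL_DATE = "公表_年月日"
COL_RESIDENCE = "患者_居住地"
COL_AGE = "患者_年代"
COL_GENDER = "患者_性別"


def csv_to_patients(csv_text, updated):
    """Build the "patients" section from CSV text and an update timestamp string."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        day = row[COL_DATE].strip()  # assumed to be YYYY-MM-DD
        rows.append({
            "確定日": f"{day}T08:00:00.000Z",  # suffix copied from the sample data.json
            "居住地": row[COL_RESIDENCE].strip() or None,  # blank cell -> null
            "年代": row[COL_AGE].strip() or None,
            "性別": row[COL_GENDER].strip() or None,
            "date": day,
        })
    return {"patients": {"date": updated, "data": rows}}


def to_json(data):
    """Serialize with Japanese text kept readable instead of \\u escapes."""
    return json.dumps(data, ensure_ascii=False, indent=4)
```

Blank cells become `null` here; how they should actually be labelled in data.json is a separate decision for the project.
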

Need Discussion

I haven't added a scheduling feature yet, because I am not sure whether we want fully automatic updates. If we do, we can discuss when to run them.

  1. Do we want it to be fully automatic?
  2. What should the interval or update timing be?
amouro commented 4 years ago

Update

amouro commented 4 years ago

Update data builder for the latest CSV change.

Discharge status

    0: "入院",
    1: "退院",
    2: "入院調整中",
    NULL: "確認中"

===========

居住地・年代・性別

NULL: 調査中
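
The two rules above can be written as a small lookup. This Python sketch only illustrates the mapping and is not the data builder's code: the function names are hypothetical, and raising an error for a flag value outside 0, 1, 2 is my assumption, since the thread does not say how unexpected values should be handled.

```python
# Mapping from 患者_退院済フラグ to the label shown on the site.
DISCHARGE_STATUS = {"0": "入院", "1": "退院", "2": "入院調整中"}
DISCHARGE_UNKNOWN = "確認中"  # used when the flag is empty (NULL)
FIELD_UNKNOWN = "調査中"      # used when 居住地 / 年代 / 性別 is empty (NULL)


def discharge_label(flag):
    """Translate a discharge flag (str, int or None) into its display label."""
    key = "" if flag is None else str(flag).strip()
    if key == "":
        return DISCHARGE_UNKNOWN
    if key not in DISCHARGE_STATUS:
        # Assumption: surface unexpected codes instead of guessing a label.
        raise ValueError(f"unexpected discharge flag: {flag!r}")
    return DISCHARGE_STATUS[key]


def fill_unknown(value):
    """Return the trimmed value, or 調査中 when the cell is empty."""
    text = "" if value is None else str(value).strip()
    return text if text else FIELD_UNKNOWN
```
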


Result sample

Screen Shot on 2020-04-16 at 22:29:01

amouro commented 4 years ago

Add patient condition

Map the column 患者_状態 into data.json.

Reference: "Record the patient's condition as one of the following strings (leave blank if unknown or if it cannot be disclosed): {無症状, 軽症, 中等症, 重症, 死亡}"
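
A minimal Python sketch of checking 患者_状態 against that list follows. It is an illustration only: `normalize_condition` is a hypothetical name, and mapping a blank cell to `None` (JSON `null`) and rejecting any other string are assumptions rather than behaviour confirmed in this thread.

```python
# The five values allowed by the reference definition quoted above.
ALLOWED_CONDITIONS = ("無症状", "軽症", "中等症", "重症", "死亡")


def normalize_condition(raw):
    """Return the condition string, None for a blank cell, or raise on anything else."""
    text = "" if raw is None else str(raw).strip()
    if text == "":
        return None  # blank means unknown or not disclosed
    if text not in ALLOWED_CONDITIONS:
        raise ValueError(f"unexpected 患者_状態 value: {raw!r}")
    return text
```
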

amouro commented 4 years ago

Please review.

I would like to merge the current feature into development if the converter looks good.

Then let's discuss whether we want to schedule it to run automatically at a specific time.

amouro commented 4 years ago

Converter deployed. @masakifujie, please consider closing this issue. Thanks.