JungHulk / Hulk-Engineering

EDA #13

Open JungHulk opened 4 months ago

JungHulk commented 4 months ago

Predict bicycle rental counts by date and hour of day.

import pandas as pd
submission = pd.read_csv("./data/sampleSubmission.csv")
submission

Loading the data (train.csv)

1) First, review and understand the data fields

[Date and time: datetime-related]

datetime - hourly date + timestamp (year-month-day-hour-minute-second format)
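A minimal sketch of loading the file and splitting the datetime column into parts. The path "./data/train.csv" is an assumption (only the sampleSubmission.csv path appears above), and the two inline rows are illustrative stand-ins so the snippet runs without the real file.

```python
import io
import pandas as pd

# Illustrative rows in the train.csv column layout (not the real data).
# In practice, pass "./data/train.csv" (assumed path) instead of `sample_csv`.
sample_csv = io.StringIO(
    "datetime,season,holiday,workingday,weather,temp,atemp,"
    "humidity,windspeed,casual,registered,count\n"
    "2011-01-01 00:00:00,1,0,0,1,9.84,14.395,81,0.0,3,13,16\n"
    "2011-01-01 01:00:00,1,0,0,1,9.02,13.635,80,0.0,8,32,40\n"
)

# parse_dates turns the datetime strings into real timestamps
train = pd.read_csv(sample_csv, parse_dates=["datetime"])

# Split the timestamp into separate columns for later analysis
train["year"] = train["datetime"].dt.year
train["month"] = train["datetime"].dt.month
train["day"] = train["datetime"].dt.day
train["hour"] = train["datetime"].dt.hour

print(train.dtypes)
```

Parsing at load time means the `.dt` accessor works right away, which is convenient when grouping rentals by hour or month.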

[Categorical fields]

season – 1 = spring, 2 = summer, 3 = fall, 4 = winter ==> In the data these correspond to January-March, April-June, July-September, and October-December, i.e. calendar quarters 1, 2, 3, and 4.
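One way to check the season-to-quarter observation is to compare season against the calendar quarter derived from datetime. This is a sketch on four illustrative rows; on the real train.csv the same comparison would run over the whole frame.

```python
import pandas as pd

# Illustrative rows, one per season code (not the real data)
df = pd.DataFrame({
    "datetime": pd.to_datetime([
        "2011-02-10 08:00:00",
        "2011-05-03 12:00:00",
        "2011-08-19 17:00:00",
        "2011-11-07 09:00:00",
    ]),
    "season": [1, 2, 3, 4],
})

# Calendar quarter: 1 = Jan-Mar, 2 = Apr-Jun, 3 = Jul-Sep, 4 = Oct-Dec
df["quarter"] = df["datetime"].dt.quarter
df["month"] = df["datetime"].dt.month

# True only if season equals the calendar quarter on every row
season_is_quarter = (df["season"] == df["quarter"]).all()

# Cross-tabulation showing which months fall under each season code
print(pd.crosstab(df["season"], df["month"]))
print("season matches quarter:", season_is_quarter)
```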

holiday – whether the day is considered a holiday (1 if a public holiday, otherwise 0)

workingday - whether the day is neither a weekend nor holiday (1 if a working day, otherwise 0)
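The workingday definition can be reproduced from datetime and holiday, which is a handy sanity check on the column. A sketch on illustrative rows; `workingday_derived` is a hypothetical column name.

```python
import pandas as pd

# Illustrative rows: a Saturday, a regular Monday, and a Monday flagged as a holiday
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2011-01-01", "2011-01-03", "2011-01-17"]),
    "holiday": [0, 0, 1],
})

# dayofweek: Monday = 0 ... Sunday = 6, so 5 and 6 are the weekend
is_weekend = df["datetime"].dt.dayofweek >= 5

# Working day = neither a weekend nor a holiday
df["workingday_derived"] = (~is_weekend & (df["holiday"] == 0)).astype(int)

print(df)
```

On the real data, comparing `workingday_derived` with the provided workingday column would flag any rows where the two disagree.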

[Weather and climate]

weather

1: Clear, Few clouds, Partly cloudy

2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist

3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds

4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
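For plots and tables it can help to map the four weather codes to short labels. The label strings below are hypothetical shorthand for the descriptions above, and the series is illustrative.

```python
import pandas as pd

# Hypothetical short labels summarizing the four weather codes
weather_labels = {
    1: "clear",
    2: "mist",
    3: "light precipitation",
    4: "heavy precipitation",
}

# Illustrative weather codes (not the real data)
weather = pd.Series([1, 2, 3, 1, 4], name="weather")

# Replace each code with its label, then count occurrences per label
labeled = weather.map(weather_labels)
counts = labeled.value_counts()

print(counts)
```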

temp - temperature in Celsius

atemp - "feels like" temperature in Celsius

humidity - relative humidity

windspeed - wind speed

[Bike rental counts]

casual - number of non-registered user rentals initiated (rentals by non-members)

registered - number of registered user rentals initiated (rentals by members)

count - number of total rentals
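If count is the total of the two user types, then casual + registered should equal count on every row. A quick check, sketched on illustrative values; that the identity holds across the real train.csv is an assumption to verify, not something stated above.

```python
import pandas as pd

# Illustrative rental counts (not the real data)
df = pd.DataFrame({
    "casual": [3, 8, 5],
    "registered": [13, 32, 27],
    "count": [16, 40, 32],
})

# Use df["count"], not df.count: the attribute form refers to the DataFrame.count method
total = df["casual"] + df["registered"]
consistent = total.eq(df["count"]).all()

print("casual + registered == count on all rows:", consistent)
```

Note that casual and registered are components of the target, so they are not available as features at prediction time.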