jekwatt / idiomatic_pandas

Tips and tricks for the most common data handling tasks with pandas.

Combining multiple files with Pandas #4

Open jekwatt opened 3 years ago

jekwatt commented 3 years ago

There are many ways to combine multiple files with pandas.

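import os
from datetime import datetime
from glob import glob
from pathlib import Path

import pandas as pd

# Build the zero-padded month-day folder name, e.g. "03-07".
today = datetime.now().date()
y, m, d = today.year, today.month, today.day
md = f"{m:02d}-{d:02d}"

# The data folder sits one level above the current working directory.
p = Path.cwd().parents[0]
here = p / 'data' / 'prod' / md

# Read each CSV lazily, then stack them into a single DataFrame.
all_csv_files = sorted(glob(os.path.join(here, "*.csv")))
df_from_each_csv = (pd.read_csv(f) for f in all_csv_files)
df = pd.concat(df_from_each_csv, ignore_index=True)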
jekwatt commented 3 years ago

https://github.com/chris1610/pbpython/blob/master/notebooks/Combining-Multiple-Excel-File-with-Pandas.ipynb

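import glob

import pandas as pd

# DataFrame.append was removed in pandas 2.0, so collect the frames and concat once.
frames = [pd.read_excel(f) for f in glob.glob("../data/data_*.xlsx")]
all_data = pd.concat(frames, ignore_index=True)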
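
Both snippets follow the same pattern: list the files, read each one, and concatenate once. A minimal sketch of that pattern as a reusable function is below. The name `combine_csvs`, its `pattern` parameter, and the `source_file` column are hypothetical and not part of the original snippets; they are only there to show one way of tracing each row back to its file.

```python
from pathlib import Path

import pandas as pd


def combine_csvs(folder, pattern="*.csv"):
    """Concatenate every CSV in `folder` matching `pattern` (hypothetical helper)."""
    files = sorted(Path(folder).glob(pattern))
    if not files:
        # pd.concat raises ValueError on an empty sequence, so fail with a clearer message.
        raise FileNotFoundError(f"no files matching {pattern!r} in {folder}")
    # Tag each row with its file name so it can be traced back after concatenation.
    frames = (pd.read_csv(f).assign(source_file=f.name) for f in files)
    return pd.concat(frames, ignore_index=True)
```

With the `here` folder from the first comment, this would be called as `df = combine_csvs(here)`. Concatenating once at the end, rather than growing a DataFrame inside the loop, avoids copying the accumulated data on every iteration.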