the-magnificents / 04-02-2021-Carpentry-for-HGIS

A Carpentry workshop for a Digital Humanities audience that works with geospatial data.

04-02-2021-Carpentry-for-HGIS/02_Day_2_Python_GIS/exercise/B7_Exercise_Loop_Datasets #76

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

Looping Over Data Sets — Python essentials for GIS learners

https://the-magnificents.github.io/04-02-2021-Carpentry-for-HGIS/02_Day_2_Python_GIS/exercise/B7_Exercise_Loop_Datasets.html

ThoTUM86 commented 3 years ago

The expressions to load a file, like "data/gapminder_*.csv", all start with data.

To run them on my PC I have to fill in the whole path to make it work. Is that necessary, or am I missing some shortcut?

cforgaci commented 3 years ago

Exercise: Determining Matches

data/gapminder_gdp_africa.csv

Exercise: Minimum File Size

import glob
import pandas as pd
fewest = float('inf')   # start above any possible row count
for filename in glob.glob('data/data_gapminder/*.csv'):
    dataframe = pd.read_csv(filename)
    fewest = min(fewest, dataframe.shape[0])
print('smallest file has', fewest, 'records')

Result:

smallest file has 2 records

Exercise: Comparing Data

The following program works, but I don't know how to label the regions in the legend:

import matplotlib.pyplot as plt
for filename in glob.glob('data/data_gapminder/gapminder_gdp_*.csv'):
    data = pd.read_csv(filename)
    data.mean().plot()
    plt.legend(loc='best')
    plt.xticks(rotation=90)

jurra commented 3 years ago

@ThoTUM86 this is probably a path issue. If it's not working, you should get an error telling you that the file doesn't exist. Good that you tried the absolute path; that was a quick hack, well done!
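
A relative path such as "data/gapminder_gdp_africa.csv" is resolved against the current working directory of the Python process, so it only works when the script or notebook is started from the folder that contains data. The sketch below is one way to check this, assuming the working directory is the cause; the helper name resolve_from is made up for illustration.

```python
from pathlib import Path


def resolve_from(base_dir, relative_path):
    """Return the absolute path that relative_path points to from base_dir."""
    return (Path(base_dir) / relative_path).resolve()


# Where is Python currently running?
print("Current working directory:", Path.cwd())

# Which file would the relative path from the exercise refer to?
target = resolve_from(Path.cwd(), "data/gapminder_gdp_africa.csv")
print("Python will look for:", target)
print("File exists:", target.exists())
```

If the last line prints False, starting Jupyter or the script from the workshop folder (or calling os.chdir to move there) should let the short relative paths work without typing the full path.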

MertenNefs commented 3 years ago

The script for the fewest records did not work in my data folder at first because, by accident, there was a file with no data in it (from the QGIS exercise). When I added 'gapminder' before the * it worked. So why does it not work when a file has 0 records, or with files other than the gapminder ones?
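
A likely explanation, assuming the stray file was completely empty: the pattern '*.csv' matches every CSV file in the folder, and pd.read_csv raises an EmptyDataError when a file has no columns to parse, which stops the loop. A narrower pattern never hands that file to pandas. The sketch below shows only the matching step, using a temporary folder and a made-up file name (qgis_export.csv) in place of the real data.

```python
import glob
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Two gapminder-style files plus one unrelated, empty CSV (hypothetical name).
    for name in ["gapminder_gdp_africa.csv", "gapminder_gdp_asia.csv", "qgis_export.csv"]:
        Path(folder, name).touch()

    # The broad pattern picks up every CSV file, including the stray one.
    all_csv = sorted(Path(p).name for p in glob.glob(str(Path(folder) / "*.csv")))

    # The narrower pattern only matches files whose names start with 'gapminder_'.
    gapminder_only = sorted(
        Path(p).name for p in glob.glob(str(Path(folder) / "gapminder_*.csv"))
    )

print("*.csv matches:", all_csv)
print("gapminder_*.csv matches:", gapminder_only)
```

Files from other exercises that are not empty would be read without an error, but their row counts would then be included in the minimum, so the narrower pattern is the safer choice in both cases.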

MertenNefs commented 3 years ago

This works without knowing how to string-split the regions from the filenames, so it is a lot less elegant than the for loop in the solution ;)

import pandas as pd
import matplotlib.pyplot as plt

africa = pd.read_csv('data/gapminder_gdp_africa.csv')
americas = pd.read_csv('data/gapminder_gdp_americas.csv')
asia = pd.read_csv('data/gapminder_gdp_asia.csv')
europe = pd.read_csv('data/gapminder_gdp_europe.csv')
oceania = pd.read_csv('data/gapminder_gdp_oceania.csv')

africa.mean().plot(label='africa')
americas.mean().plot(label='americas')
asia.mean().plot(label='asia')
europe.mean().plot(label='europe')
oceania.mean().plot(label='oceania')
plt.legend(loc='best')
plt.xticks(rotation=90)
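
For labelling the regions inside a loop, one possible approach is to take the region name from each filename and pass it as the label when plotting. This is a minimal sketch, not necessarily the workshop's official solution: the helper names region_from_filename and plot_mean_gdp are invented here, the files are assumed to follow the 'gapminder_gdp_<region>.csv' naming, and numeric_only=True is used on the assumption that the files contain a non-numeric country column. pandas and matplotlib are imported inside the plotting function so the filename helper can be tried on its own.

```python
import glob
from pathlib import Path


def region_from_filename(filename):
    """Extract '<region>' from a path like 'data/gapminder_gdp_<region>.csv'."""
    # .stem drops the folder and the '.csv' extension; the region is the last '_' part.
    return Path(filename).stem.split('_')[-1]


def plot_mean_gdp(pattern):
    """Plot the mean of every numeric column per file, one labelled line per region."""
    import pandas as pd
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for filename in sorted(glob.glob(pattern)):
        data = pd.read_csv(filename)
        # Skip non-numeric columns (assumed: a text column with country names).
        data.mean(numeric_only=True).plot(ax=ax, label=region_from_filename(filename))
    ax.legend(loc='best')                      # one legend, built after all lines exist
    ax.tick_params(axis='x', labelrotation=90)
    return ax


print(region_from_filename('data/gapminder_gdp_africa.csv'))  # → africa
```

Calling plot_mean_gdp('data/gapminder_gdp_*.csv') followed by plt.show() (or simply running it in a notebook cell) should then give one line per region, each with its own legend entry.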