spatial-data-discovery / sdd-2021

Fall 2021 semester repository.

Sandbox 4 | Checking the ASCII Format #7

Open dt-woods opened 3 years ago

dt-woods commented 3 years ago

The Challenge

Write a script that reads all the raster files in the sandbox folder (e.g., .txt, .asc, and no extension) and checks that the row and column counts given in the header (NROWS and NCOLS) match the actual dimensions of the data. Consider also checking that each dataset is organized in rows with space-separated columns. You may also want to check that every value is numeric (a raster value cannot contain alphabetic or special characters).

Save your script to the sandbox and name it with your username (e.g., sandbox/sb4_dt-woods.py).
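
One possible way to approach the checks described above is sketched here. It is only a starting point, not a reference solution. The function name check_ascii_raster is made up for illustration, and the default of six header lines is an assumption taken from the skiprows=6 examples in this thread. Note that split() accepts any whitespace, so this sketch does not distinguish spaces from tabs.

```python
from pathlib import Path


def check_ascii_raster(path, header_lines=6):
    """Return a list of problems found in one ASCII raster file.

    Hypothetical helper; assumes the file starts with `header_lines`
    lines of the form "KEY value".
    """
    lines = Path(path).read_text().splitlines()
    header, problems = {}, []
    for line in lines[:header_lines]:
        parts = line.split()
        if len(parts) == 2:
            header[parts[0].upper()] = parts[1]
        else:
            problems.append(f"malformed header line: {line!r}")
    try:
        nrows, ncols = int(header["NROWS"]), int(header["NCOLS"])
    except (KeyError, ValueError):
        return problems + ["NROWS/NCOLS missing or not an integer"]
    # Blank lines are ignored; every other line counts as a data row.
    rows = [line.split() for line in lines[header_lines:] if line.strip()]
    if len(rows) != nrows:
        problems.append(f"NROWS is {nrows} but found {len(rows)} data rows")
    for i, row in enumerate(rows, start=1):
        if len(row) != ncols:
            problems.append(f"row {i}: NCOLS is {ncols} but found {len(row)} values")
        for value in row:
            try:
                float(value)  # raises ValueError for anything non-numeric
            except ValueError:
                problems.append(f"row {i}: non-numeric value {value!r}")
    return problems
```

An empty list means the file passed every check. To cover the whole folder, you could call the function on each file in sandbox whose extension is .txt, .asc, or empty, and print whatever it returns.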

dt-woods commented 3 years ago

Need some help getting started? Check out these functions for reading text-based files.

numpy.loadtxt

# Example loadtxt method (skiprows=6 skips the header lines)
import numpy
a = numpy.loadtxt("file.txt", delimiter=" ", skiprows=6)

numpy.genfromtxt

# Example genfromtxt method (skip_header=6 skips the header lines)
import numpy
b = numpy.genfromtxt("file.txt", delimiter=" ", skip_header=6)

pandas.read_table

# Example read_table method (header=None because the data rows have no column names)
import pandas
my_file = "file.txt"
c = pandas.read_table(my_file, delimiter=" ", header=None, skiprows=6)
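
Once the data is loaded as an array, its shape can be compared with the header values. The sketch below is one way this might look with numpy.loadtxt. The function name data_shape_matches is hypothetical, the expected nrows and ncols are assumed to have been read from the header already, and six header lines are assumed as in the examples above. The delimiter argument is left out here, so any run of whitespace separates columns.

```python
import numpy


def data_shape_matches(path, nrows, ncols, header_lines=6):
    """Return True if the data block has exactly nrows x ncols numeric values."""
    try:
        # ndmin=2 keeps a single-row or single-column raster two-dimensional.
        data = numpy.loadtxt(path, skiprows=header_lines, ndmin=2)
    except ValueError:
        # loadtxt raises ValueError for non-numeric values and for rows
        # with differing numbers of columns.
        return False
    return data.shape == (nrows, ncols)
```

Because loadtxt refuses non-numeric values and ragged rows, a False result here can mean either a bad value or a size mismatch. A line-by-line check is needed if you want to report which one it was.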

dt-woods commented 3 years ago

There are two great examples started for sandbox 2. Based on today's discussion, it sounds like dictionaries are a good way to keep track of headers: as you read each header line, you can check whether its name is already a key (if row.split(" ")[0] not in d.keys()) and, if not, add the name as a key and store the header's value under it. Keep up the good work!
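
The dictionary idea might be sketched as follows. The function name read_header is made up for illustration, and the rule for where the header ends (the first line that is not a name followed by one value) is an assumption, not something settled in the discussion. Keys are upper-cased so that nrows and NROWS are treated as the same header.

```python
def read_header(lines):
    """Collect the leading "KEY value" lines into a dictionary."""
    d = {}
    for row in lines:
        parts = row.split()
        # Assumed rule: the header ends at the first line that is not
        # exactly two fields with a name starting with a letter.
        if len(parts) != 2 or not parts[0][0].isalpha():
            break
        key = parts[0].upper()
        if key not in d:
            d[key] = parts[1]
        else:
            raise ValueError(f"duplicate header key: {key}")
    return d
```

For example, read_header(open("file.txt")) would return the header as strings, so int(d["NROWS"]) and int(d["NCOLS"]) give the sizes to compare against the data. Because the loop stops on its own, this version does not need to know the number of header lines in advance.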