UBC-MDS / DSCI_522_Alberta-Oil-Spills


To-do for the week of Nov 24th #8

Closed huijuechen closed 5 years ago

huijuechen commented 5 years ago

For the first script (data cleaning), I can think of three things to do right now:

  1. Remove all rows with null/empty values (mostly in the substance, volume, and volume_units columns)

  2. Convert volume and volume units to a consistent numeric format

  3. Put all dates in ISO format (YYYY-MM-DD)

TBD: whether "unknown" values should be removed from the source column.
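
The three cleaning steps above could be sketched roughly as follows. This is only an illustration, not the script that was eventually written. The `date` column name, the unit labels and conversion factors in `TO_CUBIC_METRES`, and the raw date format in `DATE_INPUT_FORMAT` are all assumptions, and the function name `clean_rows` is hypothetical. Dropping rows whose volume, unit, or date cannot be parsed is a choice made for this sketch, not something decided in this thread. The "unknown" values in the source column are left untouched, since that question is still open.

```python
from datetime import datetime

# Columns that must be non-empty (names taken from the list above).
REQUIRED = ("substance", "volume", "volume_units")

# Assumed unit labels and factors for converting to cubic metres.
TO_CUBIC_METRES = {"m3": 1.0, "l": 0.001, "bbl": 0.158987}

# Assumed format of the raw dates, e.g. "11/24/2018".
DATE_INPUT_FORMAT = "%m/%d/%Y"


def clean_rows(rows):
    """Yield cleaned copies of the input dict rows, skipping unusable ones."""
    for row in rows:
        # Step 1: drop rows with null/empty values in the required columns.
        if any(not (row.get(col) or "").strip() for col in REQUIRED):
            continue

        # Step 2: parse the volume and express it in one unit (cubic metres).
        unit = row["volume_units"].strip().lower()
        if unit not in TO_CUBIC_METRES:
            continue
        try:
            volume = float(row["volume"].replace(",", ""))
            # Step 3: parse the raw date so it can be rewritten in ISO format.
            parsed = datetime.strptime(
                (row.get("date") or "").strip(), DATE_INPUT_FORMAT
            )
        except ValueError:
            continue

        cleaned = dict(row)
        cleaned["volume"] = volume * TO_CUBIC_METRES[unit]
        cleaned["volume_units"] = "m3"
        cleaned["date"] = parsed.date().isoformat()
        yield cleaned
```

The function takes any iterable of dicts, so it can be fed directly from `csv.DictReader` and its output written back with `csv.DictWriter`.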

alyciakb commented 5 years ago

Script 1, data cleaning complete (by Juno).

alyciakb commented 5 years ago

Next up on the to-do list (for lab today):

huijuechen commented 5 years ago

Script 3, model building.

TBD: what kind of summary to output to CSV

alyciakb commented 5 years ago

Script 2 is complete and working from the command line. I still need to clean up the file; I will finish that later tonight and then open a pull request.

huijuechen commented 5 years ago

Script 3, model building, is done.

Output:

huijuechen commented 5 years ago

Script 4, model visualization, is done.

Output:

alyciakb commented 5 years ago

TO DO:

huijuechen commented 5 years ago

For the final report:

huijuechen commented 5 years ago

To convert the decision tree PDF to PNG, I tested the command below and it worked:

sips -s format png results/oil_spills_model.pdf --out results/oil_spills_model.png

I added the command above to the end of the command-line instructions in README.md.

The PNG "results/oil_spills_model.png" is now ready to use in the final report.

huijuechen commented 5 years ago

Done.

huijuechen commented 5 years ago

Added the run_all.sh file, and added the command for running it to the README.