look through read_csv (and to_csv) documentation and see if there are any other useful arguments to discuss (e.g. relating to indices)

@joelostblom after a brief skim, ibis looks super neat. I am kind of tempted to switch to that, given some more investigation. And maybe mention to students the option to send raw SQL to the DB via pd.read_sql in a note box or something like that.
I will look at scrapy vs beautifulsoup shortly.
Yeah, ibis really looks impressive. My hesitation is that I don't know anyone who uses it, so I don't have good insight into corner cases or real life experience/feedback.
I just played a bit with ibis now. It's way easier to use and more natural than sqlalchemy. I would be worried if we were doing advanced stuff, but since our course just does very simple select/filter/execute, I am going to switch us over. Thanks for the suggestion!
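For reference, this is roughly the level of usage we need. A minimal sketch, assuming a recent version of ibis and a local SQLite file; the file, table, and column names here are made up for illustration:

```python
import ibis

# Connect to a local SQLite database (file name is hypothetical).
con = ibis.sqlite.connect("data.db")

# Lazily reference a table, then filter, select, and execute
# to materialize the result as a pandas DataFrame.
lang = con.table("lang")
result = (
    lang.filter(lang.mother_tongue > 100_000)
    .select("language", "mother_tongue")
    .execute()
)
print(result)
```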
I am also commenting out the web scraping and API stuff for this round, since we have more important things to handle for Jan. Issue opened to reintroduce it later: #64
look through read_csv (and to_csv) documentation and see if there are any other useful arguments to discuss (e.g. relating to indices)
Just adding to this: the ones I have used most often that we have not covered are skipinitialspace and parse_dates. I think chunksize could be useful too. That said, I am unsure whether they fit in this intro chapter (and maybe not in the book at all), or whether they could be part of the data cleaning chapter (at least the first two)?
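For context, a minimal sketch of how those three arguments are used (the file name and column names are made up for illustration):

```python
import pandas as pd

# skipinitialspace: drop the space after each delimiter (e.g. "a, b, c" headers).
# parse_dates: parse the listed columns into datetime64 instead of strings.
# (The file name and the "date" column are hypothetical.)
df = pd.read_csv(
    "survey.csv",
    skipinitialspace=True,
    parse_dates=["date"],
)

# chunksize: read the file in pieces instead of all at once,
# which helps when the file is too large to fit in memory.
total_rows = 0
for chunk in pd.read_csv("survey.csv", chunksize=10_000):
    total_rows += len(chunk)
print(total_rows)
```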
- inplace with rename (or any pandas function; it is discouraged and will be deprecated)
- col_names, but we only ever use rename
- read_table
- autoload and autoload_with
- pd.read_sql? For example (more examples here):
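Something along these lines — a minimal sketch of sending raw SQL through pd.read_sql (the database file, table, and column names are made up for illustration):

```python
import sqlite3

import pandas as pd

# Open a plain DB-API connection; pd.read_sql also accepts a SQLAlchemy connectable.
con = sqlite3.connect("lang.db")  # hypothetical SQLite file

# Send a raw SQL query and get the result back as a pandas DataFrame.
query = """
    SELECT language, mother_tongue
    FROM lang
    WHERE mother_tongue > 100000
"""
df = pd.read_sql(query, con)
con.close()
print(df.head())
```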