isdsucph / isds2021

Introduction to Social Data Science 2021 - a summer school course https://isdsucph.github.io/isds2021/
MIT License

Trouble with Ex 0.4.2 #1

Open MieH-Dk opened 3 years ago

MieH-Dk commented 3 years ago

Hi! I'm having a hard time figuring out why the following code doesn't change the name of the first four columns of 'df_weather'. I hope you can help!

df_weather.iloc[:,:4].columns=['station', 'datetime', 'obs_type', 'obs_value']

joachimkrasmussen commented 3 years ago

Hi Mie,

You could try a two-step approach: (i) create a new dataframe that contains only the four left-most columns and give it a new name, and then (ii) rename its columns using the approach suggested in this Stack Overflow answer: https://stackoverflow.com/questions/11346283/renaming-column-names-in-pandas

Best, Joachim

Note: In order for the assert statements to work, you should end up with a dataframe that is also called df_weather. I am only suggesting that you create a new dataframe in order to clearly separate the steps in the suggested approach. One solution is to call the "new" dataframe in step (i) df_weather as well.
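The two steps above can be sketched roughly as follows. This is only a minimal illustration, not the official solution: the name `df_raw`, the placeholder column labels 0 to 4, and the two data rows are hypothetical stand-ins for whatever the exercise actually loads.

```python
import pandas as pd

# Hypothetical stand-in for the exercise data: five columns with placeholder labels.
df_raw = pd.DataFrame(
    [
        ["ITE00100550", 18640101, "TMAX", 10, "E"],
        ["ITE00100550", 18640101, "TMIN", -23, "E"],
    ],
    columns=[0, 1, 2, 3, 4],
)

# Step (i): keep only the four left-most columns. The result is named df_weather
# so that the final dataframe has the name the assert statements expect.
df_weather = df_raw.iloc[:, :4].copy()

# Step (ii): rename the columns by assigning to .columns on df_weather itself.
df_weather.columns = ["station", "datetime", "obs_type", "obs_value"]

print(df_weather.columns.tolist())
```

The `.copy()` is there to make `df_weather` an independent dataframe, so that later changes to it do not depend on `df_raw`.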

MieH-Dk commented 3 years ago

I solved the task using another approach, but I was still wondering why my original code line (from my original question) didn't do the job. That was my question. Thanks :)
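For reference, a likely explanation of the behaviour in question: `df_weather.iloc[:, :4]` returns a new DataFrame object, so assigning to its `.columns` only relabels that temporary object, which is then discarded. A minimal sketch, using a made-up one-row dataframe with hypothetical column names `a` to `e`:

```python
import pandas as pd

# Hypothetical toy dataframe; the real df_weather has different contents.
df_weather = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["a", "b", "c", "d", "e"])
new_names = ["station", "datetime", "obs_type", "obs_value"]

# The original line: the labels are set on the temporary object returned by
# iloc, not on df_weather, so df_weather keeps its old column names.
df_weather.iloc[:, :4].columns = new_names
names_after_original_line = df_weather.columns.tolist()
print(names_after_original_line)

# One alternative that does change df_weather: map the first four old names
# to the new ones and assign the renamed result back to df_weather.
mapping = dict(zip(df_weather.columns[:4], new_names))
df_weather = df_weather.rename(columns=mapping)
print(df_weather.columns.tolist())
```

The key point is that the rename has to be applied to (or assigned back to) `df_weather` itself rather than to a slice of it.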

johankll commented 3 years ago

Hi Joachim,

I followed your suggested approach (see code and output below). Now I am wondering whether it is a mistake that we are not asked to save the new dataframe under a certain name. Will the hidden assert command(s) work regardless of the name of the new dataframe?

[image: screenshot of code and output]

joachimkrasmussen commented 3 years ago

Hi Johan,

Thank you for being careful here. No, you are right, you should not give your new dataframe a different name.

The reason I suggested using a different name was simply to separate the two steps clearly. However, for the purposes of providing answers to this assignment, there will obviously be a problem with the assert statement if you end up with a dataframe that has another name. Thank you for making me aware of this unfortunate implication of my suggested approach!

I have added a note above that hopefully makes it clear.

Best, Joachim

johankll commented 3 years ago

Hi Joachim,

Thank you for the answer. Would the below code satisfy the assert statement?

Also: does the order in which the cells are run ensure that overwrites (such as the ones in the code below) do not cause problems with the assert statements? If not, could this cause problems with the assert commands for Ex 0.4.1?

[image: screenshot of code]

joachimkrasmussen commented 3 years ago

Hi Johan,

I don't want to directly verify whether the above solution is "correct". However, there is nothing wrong with the sequential order of your steps, and you end up with a dataframe that is called df_weather, which is what you are supposed to end up with.

As I mentioned in another thread: if you have written your code properly, you should be able to click Kernel --> Restart & Run All, and all your answers will then be produced in the right order. However, be aware that your outputs will be deleted temporarily while all the cells are being run from top to bottom.

Best, Joachim