data-8 / datascience

A Python library for introductory data science
https://www.data8.org/datascience/
BSD 3-Clause "New" or "Revised" License

Warn about scatter(colors=...) #386

Closed davidwagner closed 4 years ago

davidwagner commented 5 years ago

Once we've released the version with support for scatter(group=) and have converted the assignments and textbook (PR #384), let's warn on use of scatter(colors=). See https://github.com/data-8/datascience/pull/384#discussion_r296495737.

adnanhemani commented 5 years ago

I've seen some commits to the Textbook regarding the API change. Have we done this to the assignments yet?

davidwagner commented 5 years ago

Textbook change is done. Haven't made the changes to assignments yet.

davidwagner commented 5 years ago

Both changes are now done.

davidwagner commented 5 years ago

I propose that the next step is to create a pull request that deprecates scatter(colors=) and warns on use of it, but still supports that functionality for now (so we don't break older code).
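For illustration, here is a minimal sketch of what such a deprecation shim might look like. It is not the library's actual implementation: the standalone `scatter` function below is a hypothetical stand-in that only resolves its arguments and draws nothing, and the warning text and the choice of `FutureWarning` are assumptions.

```python
import warnings


def scatter(column_for_x, column_for_y, group=None, colors=None, **kwargs):
    """Hypothetical stand-in for a scatter method; returns the resolved options."""
    if colors is not None:
        # Warn, but keep working so that older code does not break.
        warnings.warn(
            "scatter(colors=x) is deprecated; use scatter(group=x) instead",
            FutureWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        # Treat the deprecated keyword as an alias unless group was also given.
        if group is None:
            group = colors
    # A real implementation would draw the plot here; this sketch only
    # reports which grouping column would be used.
    return {"x": column_for_x, "y": column_for_y, "group": group, **kwargs}


# Old-style call: still works, but emits a FutureWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = scatter(2, 3, colors=4)
```

`FutureWarning` is used in this sketch because Python displays it by default, whereas `DeprecationWarning` is hidden by default in many contexts, so students running older notebooks would be more likely to see the message.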

adityakuppa26 commented 4 years ago

@adnanhemani @davidwagner Fixed the issue in https://github.com/data-8/datascience/pull/451

adnanhemani commented 4 years ago

Closed as per #389.

mycarta commented 4 years ago

Hi there. FYI, colors is still used in the notebooks for the edX Data8.1x course, for example in the code cell below, from lab 05:

from functools import lru_cache as cache
# This cache annotation makes sure that if the same year
# is passed as an argument twice, the work of computing
# the result is only carried out once and then saved.
@cache(None)
def stats_relabeled(year):
    """Relabeled and cached version of stats_for_year."""
    return stats_for_year(year).relabeled(2, 'Children per woman').relabeled(3, 'Child deaths per 1000 born')
def fertility_vs_child_mortality(year):
    """Draw a color scatter diagram comparing child mortality and fertility."""
    with_region = stats_relabeled(year).join('geo', countries.select('country', 'world_6region'), 'country')
    with_region.scatter(2, 3, sizes=1, colors=4, s=500)
    plots.xlim(0,10)
    plots.ylim(-50, 500)
    plots.title(year)
fertility_vs_child_mortality(1960)

I just completed that class and downloaded the lab notebooks so I could keep experimenting with them locally on my machine. For that purpose I created a virtual environment with the latest version of the datascience library (updated today to be sure).

When I run that cell, I get an error (which I was not getting when I worked on the lab online). By the way, it is an actual error, not just a warning: [screenshot of the error, 2020-11-10]

I fixed the code in the cell by replacing this line:
with_region.scatter(2, 3, sizes=1, colors=4, s=500)
with this one:
with_region.scatter(2, 3, sizes=1, group=4, s=500)

I posted this also on the lab discussion board.