vinostroud / PDI_Bytes


Bite_5 'Lambda' #1

Closed vinostroud closed 1 year ago

vinostroud commented 1 year ago

Erik: I am trying to figure out why the code in 'Bite5_ParseNames_Lambda' is not accepted by PDI. Am I supposed to remove dups, and then sort by last name and then by first name length? The print commands which I commented out were to check the code I submitted. Visual inspection makes it seem like it's correct but the PyBites tests are rejecting it :(

Either way this was good practice for lambda functions, which much like list comprehensions I feel like I get but usually fail to execute successfully.

JnyJny commented 1 year ago

Overall

https://github.com/vinostroud/PDI_Bytes/blob/0ce686f9016cc90c0c98e7668742101a4b793928/Bite5_ParseNames_Lambda.py#L1-L25

Bites can be tricky and you have a good start. The dedup_and_title_case_names function is a "data normalization" operation and should be called by sort_by_surname_desc and shortest_first_name. This has to be deduced from looking at the tests:

def test_sort_by_surname_desc():
    names = sort_by_surname_desc(NAMES)
    assert names[0] == 'Julian Sequeira'
    assert names[-1] == 'Alec Baldwin'

def test_shortest_first_name():
    assert shortest_first_name(NAMES) == 'Al'

All the lines starting with assert are testing the output produced by your functions and you can see the values being tested against are all title cased. Sometimes Bites need a little bit of sleuthing in the tests to figure them out. The good news is that tests (generally) don't lie :)

Small Bugs in dedup_and_title_case_names

https://github.com/vinostroud/PDI_Bytes/blob/0ce686f9016cc90c0c98e7668742101a4b793928/Bite5_ParseNames_Lambda.py#L8-L11

In this function, you have the right idea with the list comprehension but a flawed implementation.

First problem: on line 9, you are using the name names_titled and appending to it but it hasn't been declared yet so it doesn't "exist" yet.

The second, bigger problem is that names_titled is overcomplicating your list comprehension.

    names_titled = [name.title() for name in names]

The list comprehension will create a list full of title cased names and all you've got to do is assign that new list to a variable name. I really like the test you added to the list comprehension on line 9, but again names_titled doesn't exist yet. Even if it did, you would run into the problem of "mutating" (changing) a list while you are building it, which you should avoid. It causes bugs that are very confusing and difficult to find.

Any time you hear the phrase "deduplication" or "dedup", you should think of the Python set data type. Sets are very useful!

    names_titled = [name.title() for name in set(names)]

Now we are not referencing a list that is changing as we build it.

The final problem is the return statement on line 10: in your version of the function there isn't a variable called names_titled. Sometimes we don't need to assign results to variables and can just return the result of an expression directly.

def dedup_and_title_case_names(names):
    return [name.title() for name in set(names)]

Sometimes it's convenient for debugging to assign it to an intermediate variable before returning.
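
One caveat about the order of operations, sketched below with hypothetical sample data (not the Bite's NAMES list): calling set(names) before title casing only removes exact duplicates. If the input could contain the same name in different casings, title casing first and deduplicating afterwards catches those too. The function names here are made up for the comparison, and whether the Bite's input ever needs this is an assumption.

```python
def dedup_then_title(names):
    # Variant from the review: drop exact duplicates, then title case.
    return [name.title() for name in set(names)]


def title_then_dedup(names):
    # Alternative: title case first so differently cased duplicates collapse too.
    return list({name.title() for name in names})


# Hypothetical sample data, not the Bite's NAMES list.
sample = ["al pacino", "Al Pacino", "brad pitt", "brad pitt"]

# Sets are unordered, so sort the results before printing them.
print(sorted(dedup_then_title(sample)))  # ['Al Pacino', 'Al Pacino', 'Brad Pitt']
print(sorted(title_then_dedup(sample)))  # ['Al Pacino', 'Brad Pitt']
```

Both versions behave the same when every duplicate in the input is spelled identically.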

The Perils of Sorting in Python

https://github.com/vinostroud/PDI_Bytes/blob/0ce686f9016cc90c0c98e7668742101a4b793928/Bite5_ParseNames_Lambda.py#L15-L17

There are a couple of different ways of sorting in Python and unfortunately it can be confusing which one to use when. The list class has a sort method, which you've used. This method sorts the list "in-place", changing the target list. This is another case of mutating a list, in this case the list passed into your function. It's considered "Bad Form™" to change function argument values in a function, so instead we use the other Python sorting mechanism, called sorted. The sorted built-in function is similar to the list.sort method, but instead of changing the input list it creates a new list.

def sort_by_surname_desc(names):
    # something goes here :)
    return sorted(deduped_names, key=lambda name: name.split()[-1], reverse=True)

As you can see, the lambda function passed as the key parameter is the same one you used with the sort method, but now we are creating a new list from deduped_names that is sorted by surname in descending order.

Another common pitfall with the list.sort method is that it returns None, so assigning its result to a variable loses the list. Compare it with sorted:

>>> result = names.sort()
>>> print(result)
None
>>> result = sorted(names)
>>> print(result)
['Alice', 'Bob', 'Charlie', 'Dave']
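
To make the "mutating the argument" point concrete, here is a minimal sketch with hypothetical sample data and made-up function names. It shows that list.sort reorders the caller's list, while sorted leaves it alone and hands back a new one.

```python
def surnames_desc_in_place(names):
    # list.sort reorders the caller's list and itself returns None.
    names.sort(key=lambda name: name.split()[-1], reverse=True)
    return names


def surnames_desc_copy(names):
    # sorted builds and returns a new list; the argument is untouched.
    return sorted(names, key=lambda name: name.split()[-1], reverse=True)


# Hypothetical sample data.
original = ["Alec Baldwin", "Julian Sequeira", "Brad Pitt"]

copy_result = surnames_desc_copy(original)
print(original)     # ['Alec Baldwin', 'Julian Sequeira', 'Brad Pitt']
print(copy_result)  # ['Julian Sequeira', 'Brad Pitt', 'Alec Baldwin']

surnames_desc_in_place(original)
print(original)     # ['Julian Sequeira', 'Brad Pitt', 'Alec Baldwin']
```

After the in-place call, any other code still holding a reference to original sees the reordered list, which is the surprise sorted avoids.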

A Hint for shortest_first_name

https://github.com/vinostroud/PDI_Bytes/blob/0ce686f9016cc90c0c98e7668742101a4b793928/Bite5_ParseNames_Lambda.py#L23-L25

This is so close! Your lambda function is great and your use of sorted is spot on. The Bite wants you to return the name with the shortest first name and your code is returning a list sorted by first name length. I'm going to let you work on this one a little more before spoiling it for you :)

Good work!

vinostroud commented 1 year ago

OK, I didn't realize I was to use the deduplicated list created in function 1 for functions 2 and 3!

This works and makes sense. The shortest_first_name() function probably could be consolidated but...it passes.

def sort_by_surname_desc(names):
    deduped_names = dedup_and_title_case_names(names)
    return sorted(deduped_names, key=lambda name: name.split()[-1], reverse=True)

def shortest_first_name(names):
    deduped_names = dedup_and_title_case_names(names)
    shorted_first_name_list = sorted(deduped_names, key=lambda name: len(name.split()[0]))
    shortest_full_name = shorted_first_name_list[0].split()
    return shortest_full_name[0]
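
The consolidation mentioned above could be sketched with the built-in min and a key function, which finds the smallest item without sorting the whole list. This is one possible shape, not the Bite's reference solution, and the sample data is hypothetical.

```python
def dedup_and_title_case_names(names):
    return [name.title() for name in set(names)]


def shortest_first_name(names):
    # Collect only the first names, then pick the one with the smallest length.
    first_names = [name.split()[0] for name in dedup_and_title_case_names(names)]
    return min(first_names, key=len)


# Hypothetical sample data.
sample = ["al pacino", "alec baldwin", "julian sequeira", "al pacino"]
print(shortest_first_name(sample))  # Al
```

If two first names tie for shortest, min returns whichever it meets first, and since set has no guaranteed order that could vary between runs.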

Regarding the PyBites tests: when it says, e.g., assert ... == 'Al', does the value after the assert mean that's the expected output? Or is that the value passed into the test?

Thanks! Speak tomorrow (Friday)

JnyJny commented 1 year ago

Regarding the PyBites tests: when it says, e.g., assert ... == 'Al', does the value after the assert mean that's the expected output? Or is that the value passed into the test?

Great question. assert is a little weird: it is a statement (a Python keyword), not a built-in function like print, sum, or sorted, so it doesn't take parentheses. Asserts are traditionally instructions that programmers put in programs to say "this fact is true; if it's not, crash the program".

>>> assert True
>>> assert False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

In the case of the tests referenced above, the asserts are checking the "truthiness" of the expression immediately following the assert keyword.

def test_sort_by_surname_desc():
    names = sort_by_surname_desc(NAMES)
    assert names[0] == 'Julian Sequeira'
    assert names[-1] == 'Alec Baldwin'

In this test, the two expressions names[0] == 'Julian Sequeira' and names[-1] == 'Alec Baldwin' are checked by the assert statements. If either is false, an AssertionError is raised, which pytest uses to tell us more about the failure that occurred. Asserts can go in regular programs too (not just tests); however, I recommend against using them there since they tend to crash programs, which is jarring to the user.
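
A small sketch of what happens when an assert fails, using a hypothetical helper function and made-up data. The optional text after the comma is a message attached to the AssertionError; the Bite's tests above don't use one, so it is included here only for illustration.

```python
def check_first_name(names):
    # The expression after assert is evaluated; a falsy result raises AssertionError.
    # The optional text after the comma becomes the error message.
    assert names[0] == "Julian Sequeira", f"expected 'Julian Sequeira', got {names[0]!r}"


check_first_name(["Julian Sequeira", "Alec Baldwin"])  # passes silently

try:
    check_first_name(["Alec Baldwin", "Julian Sequeira"])
except AssertionError as error:
    message = str(error)
    print(message)  # expected 'Julian Sequeira', got 'Alec Baldwin'
```

So the value on the right of == is the expected output, and the left side is whatever your function actually returned.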

def test_shortest_first_name():
    assert shortest_first_name(NAMES) == 'Al'

In this test, the expression is the shortest_first_name function called with the NAMES list and the return value is compared to 'Al'. The fun thing about most functions is they are deterministic, in that given a set of inputs they should always produce the same outputs. This only changes when you start adding things like random numbers into your functions.
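
As a rough illustration of why determinism matters for tests, compare a deterministic helper with one that uses random numbers. Both functions and the sample data are hypothetical.

```python
import random


def first_name_length(name):
    # Deterministic: the same input always gives the same output.
    return len(name.split()[0])


def pick_random_name(names):
    # Not deterministic: the result can differ from call to call.
    return random.choice(names)


# Hypothetical sample data.
sample = ["Al Pacino", "Alec Baldwin", "Julian Sequeira"]

assert first_name_length("Al Pacino") == 2  # safe to pin to one exact value
assert pick_random_name(sample) in sample   # can only check a looser property
```

The first assert can compare against a single expected value; the second has to settle for checking that the result is one of the allowed choices.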

vinostroud commented 1 year ago

Resolved: all tests passed, and the assert statement is clearer now.