Closed sonali-sr closed 1 year ago
Hi Professor - I am having trouble extracting groups from re.search. I could do it for the example, but for the full list, I cannot extract the group 1 match (company name). The list comprehension returns the match: <re.Match object; span=(0, 61), match='COUNTY FAIR FARM (COMPANY) AND ANDREW WILLIAMSON > . What am I missing?
This is because debar_3c1 is the entire list produced by your list comprehension. It is the elements inside that list that are each either an re.Match object or None (NoneType).
@FanniVarhelyi: The answer provided by @brad-wayne is correct, so you will need to access the elements inside the list individually. You can print debar_3c1 to check that it contains the result you want, just to be sure.
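A minimal sketch of what accessing the elements could look like. Only the variable name debar_3c1 comes from this thread; the sample names and the pattern below are assumptions made for illustration. Since each element is either a match object or None, group(1) has to be called per element, with a guard for the None case.

```python
import re

# Hypothetical sample names and pattern (assumptions, not from the thread).
names = [
    "COUNTY FAIR FARM (COMPANY) AND ANDREW WILLIAMSON (INDIVIDUAL)",
    "CISCO PRODUCE INC.",
]
pattern = r"(.*)\s+\(COMPANY\)"

# re.search returns a re.Match object on success and None otherwise,
# so this list holds a mix of the two.
debar_3c1 = [re.search(pattern, name) for name in names]

# Call .group(1) on each element, not on the list itself; keep None where nothing matched.
company_names = [m.group(1) if m is not None else None for m in debar_3c1]
print(company_names)
```

Printing the list of extracted strings, rather than the list of match objects, makes it easy to see which names matched and which did not.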
3. Optional extra credit 1: regex to separate companies from individuals
C. Iterate over the name_clean column in debar and use regex to create two new columns in debar:
- co_name: A column for company (fullname_clean string if no match; pattern before COMPANY if one extracted)
- ind_name: A column for individual (fullname_clean string if no match; pattern before INDIVIDUAL if one extracted)

D. Print three columns for the rows in debar containing the negative example and positive example described above (county fair farm and cisco produce):
- name_clean
- co_name
- ind_name
- Violation
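One way the fallback logic in part C might be sketched, using only the standard library. The pattern, the helper name split_name, and the sample strings are all hypothetical; the sketch assumes names of the form "<company> (COMPANY) AND <individual> (INDIVIDUAL)", which may not cover every row in the real data.

```python
import re

# Assumed name format: "<company> (COMPANY) AND <individual> (INDIVIDUAL)".
PATTERN = re.compile(r"(.*)\s+\(COMPANY\)\s+AND\s+(.*)\s+\(INDIVIDUAL\)")

def split_name(name):
    """Return (co_name, ind_name); both fall back to the full name if there is no match."""
    m = PATTERN.search(name)
    if m is None:
        return name, name
    return m.group(1), m.group(2)

# Hypothetical positive and negative examples.
names = [
    "COUNTY FAIR FARM (COMPANY) AND ANDREW WILLIAMSON (INDIVIDUAL)",
    "CISCO PRODUCE INC.",
]
co_name = [split_name(n)[0] for n in names]
ind_name = [split_name(n)[1] for n in names]
print(co_name)
print(ind_name)
```

With a pandas DataFrame, the same helper could be applied to the name_clean column (for example through Series.apply) and the two results assigned to the new co_name and ind_name columns.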