def find_related_cols_by_content

NCATComp410 / comp410_summer2020

COMP410 summer project to prepare data for DFS

Apache License 2.0

def find_related_cols_by_content #14

Closed Jsostmann closed 4 years ago

Jsostmann commented 4 years ago

# path
#    Starting path location to begin search
#

claesmk commented 4 years ago

Which function are you implementing the starting path for? See this post - Team Pecan Cookies is working from the files mentioned in that post.

Jsostmann commented 4 years ago

ohhhh okay I see now. Thanks!

Jsostmann commented 4 years ago

# dataframe_list
#     List of pandas dataframe objects
#

dataFrame_List

claesmk commented 4 years ago

OK, this is a good start

Jsostmann commented 4 years ago

@claesmk Hey Mr. Mike, I had a quick question about ignoring files. I forked the repo, and when I try to push my changes, the trip_logs.csv file is too large. Would it be okay if I added *.csv to the .gitignore file?

claesmk commented 4 years ago

You bring up an excellent point @Jsostmann. When you first clone the repo, it includes a couple of .csv files that are used for testing purposes.

The remaining csv files are not created until you run the demo.py file. I deliberately did not check those in because I expected everyone to run the demo and download them, and I don't want them checked in to the repo.

If you ignore *.csv, you will also block the upload of some csv files we might need to keep in the repo. What you could do instead is add just the files we expect to have been downloaded (airlines.csv, airports.csv, flights.csv, and trip_logs.csv).

What you should do is open a new issue for this, and in that issue commit only the changes to .gitignore.
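
As a rough illustration of the difference between ignoring *.csv and listing only the downloaded files, here is a minimal Python sketch. It only approximates how git treats slash-free ignore patterns (each pattern is compared against the file name at any depth); it is not git's actual matcher. The path `tests/sample_fixture.csv` is a hypothetical stand-in for a test file kept in the repo, and the helper name `ignored` is made up for this example.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# Files the thread says are downloaded when demo.py runs.
DOWNLOADED = ["airlines.csv", "airports.csv", "flights.csv", "trip_logs.csv"]


def ignored(path, patterns):
    """Approximate gitignore matching for slash-free patterns only:
    the pattern is compared against the file name, at any depth."""
    name = basename(path)
    return any(fnmatchcase(name, pattern) for pattern in patterns)


paths = [
    "data/trip_logs/trip_logs.csv",  # downloaded by demo.py
    "tests/sample_fixture.csv",      # hypothetical test file kept in the repo
]

# The broad pattern catches every csv file, including the test file.
broad = [p for p in paths if ignored(p, ["*.csv"])]

# The explicit list catches only the downloaded files.
narrow = [p for p in paths if ignored(p, DOWNLOADED)]
```

With the broad pattern both paths are ignored; with the explicit list only the downloaded file is.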

Jsostmann commented 4 years ago

Oh okay! So you're saying that inside .gitignore I should just add

...
...
airlines.csv
airports.csv
flights.csv
trip_logs.csv

instead of *.csv because these are just the test files we assume everyone has downloaded locally already?

claesmk commented 4 years ago

I think you will need to add the full relative path, i.e.

data/trip_logs/trip_logs.csv
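
One possible way to add such entries without duplicating lines that are already present is a small helper like the sketch below. The function name `add_ignore_entries` is hypothetical and not part of the project; editing .gitignore by hand works just as well. Only the trip_logs.csv path is given in this thread, so the relative paths of the other downloaded files would have to be checked locally.

```python
from pathlib import Path


def add_ignore_entries(gitignore, entries):
    """Append entries that are not yet listed in the given .gitignore file.

    Returns the list of entries that were actually added.
    """
    path = Path(gitignore)
    text = path.read_text() if path.exists() else ""
    present = {line.strip() for line in text.splitlines()}

    missing = []
    for entry in entries:
        if entry not in present:
            missing.append(entry)
            present.add(entry)  # also guards against duplicates in `entries`

    if missing:
        # Make sure existing content ends with a newline before appending.
        if text and not text.endswith("\n"):
            text += "\n"
        path.write_text(text + "\n".join(missing) + "\n")
    return missing
```

Calling `add_ignore_entries(".gitignore", ["data/trip_logs/trip_logs.csv"])` from the repo root would append that one path on the first call and do nothing on later calls.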

Jsostmann commented 4 years ago

Okay, Thanks for the clarification!