georgetown-cset / parat

🦜 PARAT: CSET's Private-sector AI-Related Activity Tracker
https://parat.cset.tech

A ton of companies that are in mature stage labelled "Unknown" #293

Closed. ngorluong closed this issue 1 month ago

ngorluong commented 3 months ago

Companies like Mitsubishi, BMW, PWC, etc. are in the mature stage, but their stages are labelled "Unknown".

jmelot commented 2 months ago

@rggelles could you please take a look at this one? I do get a null stage from `select stage from ai_companies_visualization.visualization_data where lower(name) = "mitsubishi"`, and 434 companies have a null stage in that table.
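For anyone who wants to reproduce this kind of check locally, here is a minimal sketch that uses an in-memory SQLite table as a stand-in for `ai_companies_visualization.visualization_data`. Only the `name` and `stage` columns come from the query above; the helper names, the sample rows, and the "Growth" value are made up for illustration, and the real table is not SQLite.

```python
import sqlite3


def build_demo_table(rows):
    """Create an in-memory stand-in table with (name, stage) rows (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visualization_data (name TEXT, stage TEXT)")
    conn.executemany("INSERT INTO visualization_data VALUES (?, ?)", rows)
    return conn


def stage_for(conn, name):
    """Return the stage for a company by case-insensitive name.

    Returns None both when the stage is NULL and when no row matches.
    """
    row = conn.execute(
        "SELECT stage FROM visualization_data WHERE lower(name) = ?",
        (name.lower(),),
    ).fetchone()
    return row[0] if row else None


def count_null_stages(conn):
    """Count rows whose stage is NULL (the analogue of the 434 figure)."""
    return conn.execute(
        "SELECT COUNT(*) FROM visualization_data WHERE stage IS NULL"
    ).fetchone()[0]


# Hypothetical sample rows, not real PARAT data.
demo = build_demo_table(
    [("Mitsubishi", None), ("BMW", None), ("ExampleCo", "Growth")]
)
print(stage_for(demo, "mitsubishi"))
print(count_null_stages(demo))
```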

rggelles commented 2 months ago

I've added a PR for this that will affect about 100 of these -- once it's reviewed, it'll have to be pushed to Airflow.

Basically, it looks like I had the code set up so that if the funding table in Crunchbase didn't have the uuid for the Crunchbase entry (that is, if the org had no funding whatsoever in Crunchbase), it never went on to check the organizations table to see whether the org was a very large organization anyway.
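A rough sketch of the kind of bug described above, and one way the fallback could look. This is not the actual PARAT code: the function names, the `employee_count` field, the 10,000-employee threshold, and the "Mature" / "Growth" labels are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical cutoff for "very large organization"; the real rule is not stated here.
LARGE_ORG_MIN_EMPLOYEES = 10_000


def is_very_large(uuid: str, orgs: Dict[str, dict]) -> bool:
    """Check the (stand-in) organizations table for a very large org."""
    org = orgs.get(uuid, {})
    return (org.get("employee_count") or 0) >= LARGE_ORG_MIN_EMPLOYEES


def assign_stage_buggy(
    uuid: str, funding: Dict[str, List[str]], orgs: Dict[str, dict]
) -> Optional[str]:
    """Bails out when the uuid is absent from funding, so size is never checked."""
    rounds = funding.get(uuid)
    if rounds is None:
        return None  # null stage, even for a very large organization
    if is_very_large(uuid, orgs):
        return "Mature"
    return "Growth"  # placeholder for whatever the funding rounds imply


def assign_stage_fixed(
    uuid: str, funding: Dict[str, List[str]], orgs: Dict[str, dict]
) -> Optional[str]:
    """Checks the organizations table first, regardless of funding rows."""
    if is_very_large(uuid, orgs):
        return "Mature"
    if not funding.get(uuid):
        return None  # still unknown: no funding and not very large
    return "Growth"  # same placeholder as above


# Hypothetical stand-ins for the Crunchbase funding and organizations tables.
funding = {"startup-1": ["seed"]}
orgs = {
    "big-org": {"employee_count": 80_000},
    "startup-1": {"employee_count": 12},
    "tiny-org": {"employee_count": 5},
}
print(assign_stage_buggy("big-org", funding, orgs))
print(assign_stage_fixed("big-org", funding, orgs))
```

In this sketch, an org with no funding rows that is also not very large still gets a null stage after the fix.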

I don't know what the deal is with the other 300 or so organizations; possibly it just has to do with how we're defining stages and maturity. If you want to give me some examples, I'm happy to take a look -- the Mitsubishi example is fixed by this.