za158 closed 1 year ago
@za158 Rebecca and I just chatted. We can pull NAICS codes off Refinitiv, but we'd have to do that one company at a time, and the question is whether that is proprietary data. So the best option is probably just pulling GICS codes from the Wikipedia page. For the ones that are no longer on the list this year, we can search for them separately.
For Fortune 500, you'd need to subscribe to their site. It was a dollar when I did it. Once you see the full list and their methodology, you can decide whether you want someone to click on each company and extract their industry.
Wow, there's really no other way? Don't we have Refinitiv API access? (I may be woefully out of date)
At a minimum, the Open PermID API has economic sector info and is CC-licensed. Maybe we should just use that. I'd rather avoid trying to smush S&P 500 GICS together with whatever Fortune did.
Duplicate of #120
General requirement - as many PARAT v2 companies as is feasible, including at a minimum all S&P 500/Global 500 companies, should have at least one associated high-level economic sector, such as healthcare, IT, manufacturing, etc. If possible we should avoid making judgments ourselves and instead just pull data from someone who has already done the sorting.
As for exactly how to do this, I think I would defer to @rggelles and @ngorluong, especially since I no longer have Refinitiv access. The most obvious thing to try would be to pull what Refinitiv has for these companies via PermID - I assume at a minimum they will have NAICS/SIC for all publicly traded companies.
As a fallback we already have GICS for the S&P 500 (it's in the Wikipedia page the data originally came from).
Thoughts?
cf https://github.com/georgetown-cset/parat/pull/19#issuecomment-1658910026
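For what it's worth, the "PermID first, GICS as a fallback" order described above could look roughly like the sketch below. Everything in it is an assumption for illustration: the function names, the company names, and the sector labels are hypothetical, and the two lookup tables stand in for whatever we actually manage to export from PermID and from the Wikipedia S&P 500 table. It only shows the merge logic and a way to list companies that still lack a sector, not how to fetch the data.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical lookup tables: company name -> sector label.
# In practice these would be built from a PermID export and from the
# GICS column of the Wikipedia S&P 500 table; the values are placeholders.
Lookup = Dict[str, str]


def assign_sector(
    name: str, permid_sectors: Lookup, gics_sectors: Lookup
) -> Tuple[Optional[str], Optional[str]]:
    """Return (sector, source) for a company, preferring PermID over GICS."""
    for source, table in (("permid", permid_sectors), ("gics", gics_sectors)):
        sector = table.get(name)
        if sector:  # skip missing keys and empty strings
            return sector, source
    return None, None


def sector_coverage(
    companies: Iterable[str], permid_sectors: Lookup, gics_sectors: Lookup
) -> Tuple[Dict[str, Tuple[str, str]], List[str]]:
    """Assign sectors to all companies and collect the ones left without one."""
    assigned: Dict[str, Tuple[str, str]] = {}
    missing: List[str] = []
    for name in companies:
        sector, source = assign_sector(name, permid_sectors, gics_sectors)
        if sector is None:
            missing.append(name)
        else:
            assigned[name] = (sector, source)
    return assigned, missing


if __name__ == "__main__":
    # Placeholder data only; none of these are real records.
    permid = {"Example Corp A": "Technology", "Example Corp B": ""}
    gics = {"Example Corp B": "Health Care", "Example Corp A": "Information Technology"}
    companies = ["Example Corp A", "Example Corp B", "Example Corp C"]

    assigned, missing = sector_coverage(companies, permid, gics)
    for name, (sector, source) in assigned.items():
        print(f"{name}: {sector} (from {source})")
    print("Still missing a sector:", missing)
```

Keeping the source next to each sector means we never blend the two taxonomies silently, and the `missing` list is the set someone would have to look up separately.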