CrossroadsCX / opendata


Investigate Direct Transaction CSV Requests #13

Open cmbirk opened 2 years ago

cmbirk commented 2 years ago

It looks like the request that downloads the CSVs from the transactions search table may be possible to make directly, rather than first crawling the search interface. The following object represents the CSV request:

{
  encoding: null,
  method: 'POST',
  url: 'https://cf.ncsbe.gov/CFTxnLkup/ExportResults/',
  data: 'Params=%7B%22ReceiptType%22%3A%22%27GEN+%27%2C%27OTLN%27%2C%27IND+%27%2C%27PPTY%27%2C%27CPCM%27%2C%27LOAN%27%2C%27RFND%27%2C%27INT+%27%2C%27NFPC%27%2C%27OUTS%27%2C%27GNS+%27%2C%27FRLN%27%2C%27CNRE%27%2C%27LEFO%27%2C%27EPPS%27%2C%27DEBT%27%2C%27DON+%27%2C%27BFND%27%22%2C%22ExpenditureType%22%3A%22%27BFND%27%2C%27CCPC%27%2C%27CPE+%27%2C%27DEBT%27%2C%27IEXP%27%2C%27INTR%27%2C%27LNRP%27%2C%27NMG+%27%2C%27OPER%27%2C%27RFND%27%22%2C%22CommitteeType%22%3A%22%22%2C%22PartyType%22%3A%22%22%2C%22OfficeType%22%3A%22%22%2C%22CommitteeIDs%22%3Anull%2C%22CommitteeName%22%3A%22%22%2C%22Cities%22%3A%22%22%2C%22Counties%22%3A%22%22%2C%22State%22%3A%22%22%2C%22ZipCodes%22%3A%22%22%2C%22DateFrom%22%3A%2201%2F01%2F2021%22%2C%22DateTo%22%3A%2201%2F10%2F2021%22%2C%22OrganizationName%22%3A%22%22%2C%22FirstName%22%3A%22%22%2C%22LastName%22%3A%22%22%2C%22NameSoundsLike%22%3Afalse%2C%22NameIsOrg%22%3Afalse%2C%22Purpose%22%3A%22%22%2C%22AmountFrom%22%3A%22%22%2C%22AmountTo%22%3A%22%22%2C%22JobProfession%22%3A%22%22%2C%22JobProfSoundsLike%22%3Afalse%2C%22Employer%22%3A%22%22%2C%22EmployerSoundsLike%22%3Afalse%2C%22PaymentType%22%3A%22%22%2C%22Page%22%3A0%2C%22Debug%22%3Afalse%7D',
  headers: {
    'upgrade-insecure-requests': '1',
    origin: 'https://cf.ncsbe.gov',
    'content-type': 'application/x-www-form-urlencoded',
    'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/97.0.4691.0 Safari/537.36',
    referer: 'https://cf.ncsbe.gov/CFTxnLkup/TxnSearchResults/'
  }
}

and the decoded data string looks like:

Params={"ReceiptType":"'GEN ','OTLN','IND ','PPTY','CPCM','LOAN','RFND','INT ','NFPC','OUTS','GNS ','FRLN','CNRE','LEFO','EPPS','DEBT','DON ','BFND'","ExpenditureType":"'BFND','CCPC','CPE ','DEBT','IEXP','INTR','LNRP','NMG ','OPER','RFND'","CommitteeType":"","PartyType":"","OfficeType":"","CommitteeIDs":null,"CommitteeName":"","Cities":"","Counties":"","State":"","ZipCodes":"","DateFrom":"01/01/2021","DateTo":"01/10/2021","OrganizationName":"","FirstName":"","LastName":"","NameSoundsLike":false,"NameIsOrg":false,"Purpose":"","AmountFrom":"","AmountTo":"","JobProfession":"","JobProfSoundsLike":false,"Employer":"","EmployerSoundsLike":false,"PaymentType":"","Page":0,"Debug":false}
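As a rough sketch of how that `data` string could be rebuilt from a plain object instead of being hard-coded: the helper names `buildExportBody` and `parseExportBody` below are hypothetical, and the example passes only a few of the keys for brevity. Whether the server accepts a partial `Params` object has not been checked.

```javascript
// Hypothetical helper: encode a search-parameter object the same way the
// captured request does (a single form field named "Params" holding JSON).
function buildExportBody(params) {
  // URLSearchParams applies application/x-www-form-urlencoded rules:
  // spaces become "+", and quotes, commas, colons, and slashes are percent-encoded.
  return new URLSearchParams({ Params: JSON.stringify(params) }).toString();
}

// Hypothetical inverse: recover the parameter object from an encoded body,
// which is handy for inspecting a captured request.
function parseExportBody(body) {
  return JSON.parse(new URLSearchParams(body).get('Params'));
}

// Only a subset of the captured keys, for illustration.
const exampleBody = buildExportBody({
  DateFrom: '01/01/2021',
  DateTo: '01/10/2021',
  Page: 0,
  Debug: false,
});
console.log(exampleBody);
console.log(parseExportBody(exampleBody));
```

Running `parseExportBody` on the captured `data` string should give back the decoded object shown above, which makes it easier to change just `DateFrom` and `DateTo` between requests.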

If we can make those requests directly, it may be easier than going through the interface and waiting for the transactions table to load before downloading the bulk CSV.
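A minimal, untested sketch of what a direct request might look like, assuming Node 18+ with the built-in `fetch`. The URL and content type are copied from the captured request. The function names are hypothetical. The sketch leaves out the captured `origin`, `referer`, and `user-agent` headers, and whether the server requires them (or a session cookie from the search page) is exactly what still needs investigating.

```javascript
// URL taken from the captured request above.
const EXPORT_URL = 'https://cf.ncsbe.gov/CFTxnLkup/ExportResults/';

// Hypothetical helper: build the fetch options for an export request.
// Kept separate from the network call so it can be inspected or tested offline.
function buildExportRequest(params) {
  return {
    method: 'POST',
    headers: { 'content-type': 'application/x-www-form-urlencoded' },
    // Single form field "Params" containing the JSON-encoded search parameters.
    body: new URLSearchParams({ Params: JSON.stringify(params) }).toString(),
  };
}

// Hypothetical helper: request the CSV directly and return the raw bytes.
// Assumption: the endpoint answers without a prior visit to the search page.
async function downloadTransactionsCsv(params) {
  const response = await fetch(EXPORT_URL, buildExportRequest(params));
  if (!response.ok) {
    throw new Error(`Export request failed with status ${response.status}`);
  }
  // The captured request used `encoding: null`, which suggests the body was
  // handled as raw bytes, so return a Buffer rather than a decoded string.
  return Buffer.from(await response.arrayBuffer());
}
```

If this works, usage would be something like `const csv = await downloadTransactionsCsv(params)`, with `params` being the full decoded object above and only the date range changed per request.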