legalese / legalese.github.io

Static assets for legalese.com
https://legalese.com/

write a JS library to wrap https://www.questnet.sg/ #187

Open mengwong opened 7 years ago

mengwong commented 7 years ago

following on from #119,

Given user credentials for a questnet account that has a prepaid balance, the library should gateway business profile queries via a REST API and scrape the response into a well-documented JSON structure.

Use a well-respected REST API utility library as the framework, rather than reinventing that wheel.

REQUIRES BADGES: javascript; scraping

jobchong commented 7 years ago

http://ref.data.gov.sg/UENfiles/UEN_DATAGOV.zip is hosted by the "old" data.gov.sg, and contains company UEN and name in the following format:

<DATA>
  <UEN>196800160G</UEN>
  <ISSUANCE_AGENCY_ID>ACRA</ISSUANCE_AGENCY_ID>
  <UEN_STATUS>R</UEN_STATUS>
  <ENTITY_NAME>TANGLIN HOTEL (PRIVATE) LIMITED</ENTITY_NAME>
  <ENTITY_TYPE>LC</ENTITY_TYPE>
  <ENTITY_STATUS>0</ENTITY_STATUS>
  <UEN_ISSUE_DATE>20080909</UEN_ISSUE_DATE>
</DATA>

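
For anyone consuming that file from JS, here is a minimal sketch of turning one `<DATA>` record into a plain object. The function name `parseUenRecord` is hypothetical and not part of the questnet library. It assumes flat records exactly like the sample above (no nesting, attributes, or CDATA) and does not decode XML entities such as `&amp;`, so a proper XML parser is probably the better choice for the full file.

```javascript
// Sketch only: extracts the flat child elements of a single <DATA> record.
// Assumes upper-case tag names and text-only values, as in the sample.
function parseUenRecord(xml) {
  const record = {};
  // Matches <TAG>value</TAG> where the value contains no further markup.
  const fieldPattern = /<([A-Z_]+)>([^<]*)<\/\1>/g;
  for (const [, tag, value] of xml.matchAll(fieldPattern)) {
    record[tag] = value.trim();
  }
  return record;
}

// Usage with the record shown above.
const sample = `
<DATA>
  <UEN>196800160G</UEN>
  <ISSUANCE_AGENCY_ID>ACRA</ISSUANCE_AGENCY_ID>
  <UEN_STATUS>R</UEN_STATUS>
  <ENTITY_NAME>TANGLIN HOTEL (PRIVATE) LIMITED</ENTITY_NAME>
  <ENTITY_TYPE>LC</ENTITY_TYPE>
  <ENTITY_STATUS>0</ENTITY_STATUS>
  <UEN_ISSUE_DATE>20080909</UEN_ISSUE_DATE>
</DATA>`;

console.log(parseUenRecord(sample));
```

All values stay as strings, including `UEN_ISSUE_DATE`, since the file does not say how the dates should be interpreted beyond the `YYYYMMDD` look of the sample.
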
jobchong commented 7 years ago

Completed at https://github.com/legalese/legalese-google-app/commit/a7fbbfa2f3ae18f3697c720644abfe27a456e10b

Library located at legalese-google-app/questnet

jobchong commented 6 years ago

Update of this issue:

Have since coordinated with the Proteus team to adapt this to the v2 stack. The script opens questnet, searches for a business profile by UEN, and pays from the prepaid balance on questnet purchased by Legalese. It then scrapes the result frame and returns JSON. It can account for different nationalities and types of officers, and it represents the data that questnet offers in the business profile search accurately. Shareholder functionality is limited to the basic case, which is one type of share per shareholder.
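
To illustrate the shareholder limitation, here is a rough sketch of how the basic case could be enforced when building the JSON. Everything in it is hypothetical: the function `groupShareholders` and the row fields `name`, `shareType`, and `shares` are made up for the example and are not the names used in the actual script.

```javascript
// Sketch only: collapses scraped shareholder rows into one entry per
// shareholder, and rejects anything beyond the basic case of a single
// share type per shareholder. Field names are assumptions.
function groupShareholders(rows) {
  const byName = new Map();
  for (const { name, shareType, shares } of rows) {
    const existing = byName.get(name);
    if (existing && existing.shareType !== shareType) {
      throw new Error(`Unsupported: ${name} holds more than one share type`);
    }
    if (existing) {
      existing.shares += shares;
    } else {
      byName.set(name, { name, shareType, shares });
    }
  }
  return [...byName.values()];
}

// Usage with made-up rows.
console.log(
  groupShareholders([
    { name: "A", shareType: "ORDINARY", shares: 100 },
    { name: "B", shareType: "ORDINARY", shares: 50 },
  ])
);
```

Failing loudly on a second share type seems safer than silently merging, since the caller can then tell that the profile falls outside what is supported.
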

jobchong commented 6 years ago

The script takes 9 to 11 seconds. We can cut 2 seconds by removing both timeouts, provided we filter out bad input at the request to the data.gov.sg API in v2. I'm also trying to persist logins across script starts, which should cut another 2 to 3 seconds. Optimally, I expect the script to take 5 seconds.
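
As a sketch of the "filter out bad input" step, a format-only check could reject obviously malformed UENs before any request is made. The name `looksLikeUen` is hypothetical, and the three patterns reflect my understanding of the published UEN formats, which should be verified against the official specification. The check does not validate the check letter, so passing it does not mean the UEN exists.

```javascript
// Sketch only: format check for Singapore UENs. The patterns are an
// assumption about the published formats, not taken from the script.
const UEN_PATTERNS = [
  /^\d{8}[A-Z]$/, // businesses: 8 digits + check letter
  /^\d{4}\d{5}[A-Z]$/, // local companies: 4-digit year + 5 digits + check letter
  /^[TSR]\d{2}[A-Z]{2}\d{4}[A-Z]$/, // other entities: century letter, year, type, serial, check letter
];

function looksLikeUen(input) {
  if (typeof input !== "string") return false;
  const candidate = input.trim().toUpperCase();
  return UEN_PATTERNS.some((pattern) => pattern.test(candidate));
}

// Usage: the UEN from the data.gov.sg sample passes, junk does not.
console.log(looksLikeUen("196800160G"));
console.log(looksLikeUen("not-a-uen"));
```
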