HumanExposure / ChemicalExposure-SSC


Walmart data extraction - manual #147

Open kdionisio opened 5 years ago

kdionisio commented 5 years ago

Extract data manually for Walmart/CPCPdb products. If you have questions when looking at a pdf, you can always just skip and move to the next item!

  1. Navigate to http://factotum.epa.gov/datagroup/81/
  2. Click on the 'title' (e.g. item_###) for an entry that has a green check for extracted text but does not yet have a product created
  3. Open the pdf in a separate window by clicking the small 'pdf' icon in the upper right of the page
  4. Select the blue 'QA' button
  5. Select the 'Edit composition records' button to enable editing
  6. Edit/correct the product name to match what is on the pdf (note any trailing numbers on the product name should be removed unless they are actually part of the product name on the pdf)
  7. Add the document date in the same format as it appears on the document (typically at the top, or in the header/footer)
  8. Add/edit ingredients. Fields to check/add/edit include:
    • Ingredient rank (i.e. the order of the ingredient as listed on the pdf, 1st ing., 2nd ing., etc)
    • Raw CAS
    • Raw chemical name
    • Raw composition (if a point value, put into the 'central comp' field; if a range, put into the min and max comp fields)
    • Unit type (this will most likely be 'percent')
    • Functional use, if present (it would be listed in the 'ingredients' section alongside the chemical name/CAS)
    *Note: if you need to add more than 1 chemical, you will need to select 'Save edits', at which point a blank chemical record entry will appear; repeat until you have added all chemicals.*
  9. IMPORTANT: select 'Save edits', NOT 'Approve'
  10. Select 'Exit', which will return you to the data document page. You should see the information you just entered on the data document page.
  11. Select the 'create new product' button and fill in all fields you have available.
    • Title (this should be product name)
    • Manufacturer (Typically you will see this in the header of the pdf but ok to leave blank)
    • Brand (ok to leave blank)
    • UPC (if present on the pdf, enter it; otherwise leave the default 'stub'. Enter only ONE UPC here: if the pdf contains more than 1 UPC, each will be entered as a separate product)
    • Size (typically only present if the pdf includes a UPC)
    • Color (typically will not have this)
    • Data document type (e.g. MSDS, SDS, etc., usually stated at top of pdf)
  12. Select 'Save'
  13. Select link to 'Walmart - in CPDat' data group at the top of the data document page to begin again
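
The raw composition rule in step 8 (a point value goes in 'central comp', a range goes in the min and max comp fields) can be sketched in Python as below. This is only an illustration of the manual rule, not part of Factotum; the `Composition` class, the `route_composition` function, and the accepted text formats (e.g. `5`, `1-5%`, `10 to 30`) are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Composition:
    """Hypothetical stand-in for the composition fields on the QA page."""
    central: Optional[float] = None   # 'central comp' field
    minimum: Optional[float] = None   # min comp field
    maximum: Optional[float] = None   # max comp field
    unit_type: str = "percent"        # most likely unit, per step 8


_NUMBER = r"\d+(?:\.\d+)?"


def route_composition(raw: str) -> Composition:
    """Decide which fields a raw composition string belongs in."""
    # Drop surrounding whitespace and a single trailing percent sign.
    text = raw.strip().rstrip("%").strip()

    # A range such as "1-5" or "10 to 30" fills the min and max fields.
    range_match = re.fullmatch(rf"({_NUMBER})\s*(?:-|to)\s*({_NUMBER})", text)
    if range_match:
        low, high = (float(g) for g in range_match.groups())
        return Composition(minimum=low, maximum=high)

    # A single number is a point value and fills the central field.
    if re.fullmatch(_NUMBER, text):
        return Composition(central=float(text))

    # Anything else (e.g. "<1", "trace") is left blank for a person to decide.
    return Composition()


print(route_composition("5"))
print(route_composition("1-5%"))
print(route_composition("trace"))
```

Strings the sketch does not recognise are returned with all three fields empty, mirroring the advice at the top of the issue to skip anything you are unsure about.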
kdionisio commented 5 years ago

Note: if you want to click over to the product page and assign the PUC after you have created the product, that would be fine too!