stakwork / stak-bounties

0 stars 2 forks source link

Python Script to create a JSON Report from CSV #1

Closed Edvardoh closed 1 year ago

Edvardoh commented 1 year ago

We need to create a custom JSON Report from Stakwork Runs exported from our database.

We have extracted a sample of the Runs table into a csv called query-results.csv, as well as the schema, and an example JSON output.

All of the necessary materials can be found in this zip folder: glue.zip

Deliverables:

Example JSON Format

{
    "4119189": { // PROJECT ID
        "stakwork_info": {  // CUSTOM
            "media_url": "https://stakwork-uploads-dev.s3.amazonaws.com/ballot_verification_images/sample_distritales/0/A005143.JPG", // FROM projects -> workflow_output -> set_var -> media_url
            "run_url": "https://jobs.stakwork.com/admin/projects/${PROJECT_ID}", // ${PROJECT_ID} FROM projects -> id
            "sats_spent": 440, // SUM(amount) FROM jobs through job_containers -> project_folders -> projects
            "jobs_created": 22, // COUNT(*) FROM jobs through job_containers -> project_folders -> projects
            "aws_ocr_count": 4, // count all step usage in grandchildren
            "stak_ocr_count": 15, // count all step usage in grandchildren
            "donut_vqa_count": 11 // count all step usage in grandchildren
        },
        "barcode": "005143", // barcode output
        "anio": 2019,
        "eleccion": "3 - Distrital", // use set_var.output.eleccion if available, else derive from filename according to existing rules
        "meta": [ // each of extract_meta loop
            {
                "text": "departamento",
                "value": 1
            },
            {
                "text": "municipio",
                "value": 1
            }
        ],
        "totals": [ // each of extract_totals loop
            {
                "text": "VOTOS NULOS",
                "value": 3
            },
            {
                "text": "VOTOS BLANCOS",
                "value": 1
            }
        ],
        "codigos": [ // each of extract_parties loop
            {
                "partido": "TODOS",
                "votos": 10,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/10bcbdf3-34a5-461f-976a-7bb93b53b48d/10bcbdf3-34a5-461f-976a-7bb93b53b48d.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121138",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/84233d7f-af8f-458b-afff-50db64bc525a/84233d7f-af8f-458b-afff-50db64bc525a.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "LIBRE",
                "votos": 3,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/85845376-34b8-4866-abb8-3b0ace79e7d5/85845376-34b8-4866-abb8-3b0ace79e7d5.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121216",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "FCN-NACION",
                "votos": 2,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/aea7b88a-f1f2-4e8f-86ed-63e71ffa164e/aea7b88a-f1f2-4e8f-86ed-63e71ffa164e.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121166",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "BIEN",
                "votos": 4,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/5b00212c-2dcd-4c28-9a96-d73c1ca2b6f6/5b00212c-2dcd-4c28-9a96-d73c1ca2b6f6.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121151",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "SEMILLA",
                "votos": 24,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/78c133bc-f8ea-4c56-85aa-798e7dc3708e/78c133bc-f8ea-4c56-85aa-798e7dc3708e.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121213",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/4089abc2-9d05-4ff8-9562-0c269b98850e/4089abc2-9d05-4ff8-9562-0c269b98850e.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "PC",
                "votos": 2,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/2fe84f14-3ff7-413a-9e50-e6ea8e99ef99/2fe84f14-3ff7-413a-9e50-e6ea8e99ef99.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121207",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/a41e1975-f5e5-46f5-aff4-9d8d18abf7d0/a41e1975-f5e5-46f5-aff4-9d8d18abf7d0.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "MLP",
                "votos": 9,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/a7af8895-154a-4239-8e0c-527a64b1b856/a7af8895-154a-4239-8e0c-527a64b1b856.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121203",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "PHG",
                "votos": 16,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/c672f8f9-3205-4c9f-932c-fa2107d3170c/c672f8f9-3205-4c9f-932c-fa2107d3170c.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121201",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/029035f8-ae3a-4dad-b026-01ce3200ade2/029035f8-ae3a-4dad-b026-01ce3200ade2.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "AVANZA",
                "votos": 2,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/93279d3a-5fee-41e1-a325-4971904d5d5c/93279d3a-5fee-41e1-a325-4971904d5d5c.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121199",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "VAMOS",
                "votos": 19,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/d76e31ad-15a8-4a95-b0d9-77a513aed515/d76e31ad-15a8-4a95-b0d9-77a513aed515.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121196",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/70260b15-026e-4ab7-8ea3-827051d29516/70260b15-026e-4ab7-8ea3-827051d29516.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "PPT",
                "votos": 4,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/759b5e57-8123-4281-8932-cdc388789cb9/759b5e57-8123-4281-8932-cdc388789cb9.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121194",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "UNIDOS",
                "votos": 1,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/385c2b3a-1cff-4873-b392-49004cb4efbd/385c2b3a-1cff-4873-b392-49004cb4efbd.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121189",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/323107c2-6e4f-4c7f-9047-f5e01b44b38b/323107c2-6e4f-4c7f-9047-f5e01b44b38b.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "FUERZA",
                "votos": 2,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/8b56ef90-74e3-4c8c-b4f2-d0496a2a763e/8b56ef90-74e3-4c8c-b4f2-d0496a2a763e.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121183",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/2644d3f0-a75d-4205-89dd-4c7d3c100912/2644d3f0-a75d-4205-89dd-4c7d3c100912.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "WINAQ",
                "votos": 24,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/a3a9fa03-1730-4b46-a85d-fa29eaee94f1/a3a9fa03-1730-4b46-a85d-fa29eaee94f1.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121178",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "VICTORIA",
                "votos": 3,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/1cb02fa6-2f59-456e-925e-d21eaec5637a/1cb02fa6-2f59-456e-925e-d21eaec5637a.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121176",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "CREO",
                "votos": 22,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/38d0299b-bbef-4af7-887b-0b71a68baccc/38d0299b-bbef-4af7-887b-0b71a68baccc.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121173",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "CONVERGENCIA",
                "votos": 6,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/7f99be6d-8148-4623-918f-a64c4cf2986d/7f99be6d-8148-4623-918f-a64c4cf2986d.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121168",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/3e62492d-1286-436c-813a-5444f73f8ec5/3e62492d-1286-436c-813a-5444f73f8ec5.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "VIVA",
                "votos": 20,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/f6ca08f3-699c-4e77-b521-2d7c6707fbe3/f6ca08f3-699c-4e77-b521-2d7c6707fbe3.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121162",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "EG",
                "votos": 5,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/247f55a0-d055-45d4-b215-00bb3761d278/247f55a0-d055-45d4-b215-00bb3761d278.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121158",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/d9b4afdc-8979-4ee1-a0a7-e25a8fa7e532/d9b4afdc-8979-4ee1-a0a7-e25a8fa7e532.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "UCN",
                "votos": 2,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/de4b400d-be0b-4128-b4ca-60c1ea4aa0dc/de4b400d-be0b-4128-b4ca-60c1ea4aa0dc.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121154",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "UNIONISTA",
                "votos": 12,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/a786dc14-4dcd-4faf-8a38-a83833f1ae73/a786dc14-4dcd-4faf-8a38-a83833f1ae73.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121147",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/a9ed8f44-8ef6-47e9-91b3-5cbb4888b123/a9ed8f44-8ef6-47e9-91b3-5cbb4888b123.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            },
            {
                "partido": "UNE",
                "votos": 8,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/300d5ee0-ca3e-4899-824f-3ec604db7397/300d5ee0-ca3e-4899-824f-3ec604db7397.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121146",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "URNG MAIZ",
                "votos": 6,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/889ade6b-2c7c-4838-b5b5-e518820d51db/889ade6b-2c7c-4838-b5b5-e518820d51db.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121144",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "PODEMOS",
                "votos": 5,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/eef57170-bafc-4c84-89fc-49ca6f27d4b6/eef57170-bafc-4c84-89fc-49ca6f27d4b6.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121140",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "VALOR",
                "votos": 8,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/b7b5d6e2-c44a-4d3e-8f94-5034b498ed34/b7b5d6e2-c44a-4d3e-8f94-5034b498ed34.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121136",
                    "bigger_text": {
                        "bigger_crop": null,
                        "min_confirmations_reached": null,
                        "jobs_completed": 0
                    }
                }
            },
            {
                "partido": "PAN",
                "votos": 17,
                "job_info": {
                    "crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/22ee2b9a-10a5-4c5e-88dc-c8d35dc4f091/22ee2b9a-10a5-4c5e-88dc-c8d35dc4f091.JPG",
                    "run_link": "https://jobs.stakwork.com/admin/projects/4121135",
                    "bigger_text": {
                        "bigger_crop": "https://stakwork-uploads.s3.amazonaws.com/uploads/customers/5149/crop/47a471bc-9fbe-4443-b5fb-cc6e925de9e7/47a471bc-9fbe-4443-b5fb-cc6e925de9e7.JPG",
                        "min_confirmations_reached": true,
                        "jobs_completed": 2
                    }
                }
            } // Do not include parties which have no values
        ]
    }
}

We also have Ruby on Rails Code which generates the same report, you can use this to help do the same in Python with the csv data.

# frozen_string_literal: true

class BallotVerificationReport
  attr_reader :report_settings
  attr_reader :error, :response

  ELECTION_TYPE = {
    '1': '1 - Presidente',
    '2': '2 - Nacional',
    '3': '3 - Distrital',
    '4': '4 - Municipal',
    '5': '5 - Parlamento'
  }.freeze

  OCR_SKILL_NAME = {
    '3433': 'donut_vqa',
    '3367': 'stak_ocr',
    '3301': 'azure_ocr',
    '694': 'aws_ocr'
  }.freeze

  def initialize(report_settings)
    @report_settings = report_settings
  end

  def generate
    validate!

    return self if error.present?

    workflow_id = report_settings.params[:workflow].to_i
    from_date = DateTime.strptime(report_settings.params[:created_at], '%m/%d/%Y').to_date
    to_date = DateTime.strptime(report_settings.params[:end_date], '%m/%d/%Y').to_date

    ActiveRecord::Base.connected_to(role: :reading) do
      projects = Project.includes(:children)
                        .where(workflow_id: workflow_id)
                        .where(workflow_state: 'completed')
                        .where("updated_at::date >= ?", from_date)
                        .where("updated_at::date <= ?", to_date)
                        .where(parent_id: nil)
                        .select("id, main_parent_id, name, workflow_output -> 'extract_parties' -> 'output' -> 'child' -> 'for_each_word_extract_value' -> 'output' as extract_parties, " \
                                "workflow_output -> 'extract_totals' -> 'output' -> 'child' -> 'for_each_word_extract_value' -> 'output' as extract_totals, " \
                                "workflow_output -> 'extract_meta' -> 'output' -> 'child' -> 'for_each_word_extract_value' -> 'output' as extract_meta, " \
                                "workflow_output -> 'extract_votos_validos' -> 'output' -> 'child' -> 'for_each_word_extract_value' -> 'output' as extract_votos_validos, " \
                                "workflow_output -> 'check_barcode' -> 'output' as barcode, " \
                                "workflow_output -> 'set_var' -> 'output' as set_var_output"
                                ).order(id: :asc)

      all_jsons = {}

      projects.find_each do |project|
        json = {
          stakwork_info: {},
          barcode: nil,
          anio: nil,
          eleccion: nil,
          meta: [],
          totals: [],
          codigos: []
        }

        set_var_output = project.set_var_output
        next if set_var_output.blank?

        extract_parties_output = project.extract_parties

        next if extract_parties_output.blank?

        extract_parties_output.each do |c_val, workflow_output|
          workflow_output = workflow_output.presence || c_val.values.first
          next if workflow_output.nil?

          name = workflow_output.dig('set_var', 'output', 'detected_item', 'name')
          party_name = name.tr('0-9', '').strip

          votes = workflow_output.dig('set_final_output', 'output', 'value').to_i

          crop = workflow_output.dig('crop', 'output', 'media_url')
          big_crop = workflow_output.dig('bigger_crop', 'output', 'media_url')
          big_min_conf = workflow_output.dig('bigger_text', 'extras', 'min_confirmations_reached')
          big_jobs_completed = workflow_output.dig('bigger_text', 'extras', 'jobs_completed').to_i

          party_votes = {
            partido: party_name,
            votos: votes,
            job_info: {
              crop: crop,
              bigger_text: {
                crop: big_crop,
                min_confirmations_reached: big_min_conf,
                jobs_completed: big_jobs_completed
              }
            }
          }

          json[:codigos] << party_votes
        end

        barcode = project.barcode.dig('results', 0, 'text')

        if barcode.blank?
          file_name = set_var_output['media_url'].split('/').last

          if set_var_output['year'] == 2019
            eleccion_type = file_name.tr('a-zA-Z.', '').last
          else
            split_file_name = file_name.split('-')

            eleccion_type = split_file_name.first.tr('a-zA-Z.', '')
          end
        else
          if set_var_output['year'] == 2019
            eleccion_type = barcode[barcode.length - 1]
          else
            eleccion_type = barcode[0]
          end
        end

        eleccion = set_var_output['eleccion'].presence || ELECTION_TYPE[eleccion_type.to_s.to_sym]

        json[:barcode] = barcode
        json[:anio] = set_var_output['year']
        json[:eleccion] = eleccion

        extract_totals_output = project.extract_totals || []

        extract_totals_output.each do |child_key, child_output|
          child_output = child_output.presence || child_key.values.first

          name = child_output.dig('set_var', 'output', 'detected_item', 'name')
          next if name.blank?

          votes = child_output.dig('set_final_output', 'output', 'value').to_i

          json[:totals] << {text: name, value: votes}
        end

        extract_votos_validos_output = project.extract_votos_validos || []

        extract_votos_validos_output.each do |child_key, child_output|
          child_output = child_output.presence || child_key.values.first

          name = child_output.dig('set_var', 'output', 'detected_item', 'name')
          next if name.blank?

          votes = child_output.dig('set_final_output', 'output', 'value').to_i

          json[:totals] << {text: name, value: votes}
        end

        extract_meta_output = project.extract_meta || []

        extract_meta_output.each do |child_key, child_output|
          child_output = child_output.presence || child_key.values.first

          name = child_output.dig('set_var', 'output', 'detected_item', 'name')
          next if name.blank?

          votes = child_output.dig('set_final_output', 'output', 'value').to_i

          json[:meta] << {text: name, value: votes}
        end

        extract_parties = project.children&.find_by(child_name: "#{project.name}_extract_parties")&.children&.pluck(:id) || []
        extract_totals = project.children&.find_by(child_name: "#{project.name}_extract_totals")&.children&.pluck(:id) || []
        extract_votos_validos = project.children&.find_by(child_name: "#{project.name}_extract_votos_validos")&.children&.pluck(:id) || []
        extract_meta = project.children&.find_by(child_name: "#{project.name}_extract_meta")&.children&.pluck(:id) || []
        jobs_completed = Job.by_project(extract_parties + extract_totals + extract_votos_validos + extract_meta).closed_jobs.select(:id, :amount)
        ocrs_completed = ProjectFolder.where(skill_id: [3301, 3433, 3367, 694], project_id: extract_parties + extract_totals + extract_votos_validos + extract_meta).group(:skill_id).count

        json[:stakwork_info][:media_url] = set_var_output['media_url']
        json[:stakwork_info][:run_url] = "https://jobs.stakwork.com/admin/projects/#{project.id}"
        json[:stakwork_info][:sats_spent] = jobs_completed.sum(:amount)
        json[:stakwork_info][:jobs_created] = jobs_completed.length

        ocrs_completed.each do |skill_id, count|
          json[:stakwork_info][OCR_SKILL_NAME[skill_id.to_s.to_sym].to_sym] = count
        end

        all_jsons[project.id] = json
      end

      @response = all_jsons.to_json
    end

    self
  end

  def validate!
    @error = 'Workflow is not specified.' if report_settings.params[:workflow].blank?
  end
end
jeremyikwuje commented 1 year ago

Trying to take it. Any connection codes?

Edvardoh commented 1 year ago

Hi @jeremyikwuje - thanks for the response. Are you referring to DB connection codes? There is a .zip file with a sample dump from the DB, so you should have all the info you need there to produce the Python script. Let me know if there is anything I might have missed..

Edvardoh commented 1 year ago

@gonzaloaune I think we need sample outputs for the same Runs of the other tables so they can get all the data like jobs, sats_spent, ocr counts, etc?

Got this question from BrainShark: I was already done extracting the main fields, what was only left was refactoring the code and just a little clarification with a certain ‘run_link‘ for the elections(under the Codigos list) and also how to go about the {‘aws_ocr_count‘, ’stak_ocr_count’, ’donut_vqa_count’} which are in the example.json file

Here's a link to the script on Google colab they're building: https://colab.research.google.com/drive/1wooF1GUeWbsijYAjzl0fu4xhGCTg8ubX?usp=sharing

Edvardoh commented 1 year ago

Closing this as we updated the spec to specifically use PySpark/AWS Glue: https://github.com/stakwork/stak-bounties/issues/2