airbytehq / airbyte

The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
https://airbyte.com

Source Greenhouse: Candidates stream does not contain custom fields #29892

Closed robertoarnetoli closed 1 year ago

robertoarnetoli commented 1 year ago

Connector Name

source-greenhouse

Connector Version

0.4.2

What step the error happened?

Other

Relevant information

In the Greenhouse developer documentation (https://developers.greenhouse.io/harvest.html#the-candidate-object), you can see that the Candidates stream includes the custom_fields object, but this is not present in the fields shown in the Airbyte UI or extracted by the connector (even when selecting to extract all fields).

Here is an example of an extracted record missing the custom_fields:

{
    "_airbyte_ab_id": "0ce6d1c7-2584-41f9-8efd-adec3b0bc9be",
    "_airbyte_data": {
        "addresses": [...],
        "application_ids": [...],
        "applications": [...],
        "attachments": [...],
        "can_email": true,
        "company": "Therapeutic Health",
        "coordinator": null,
        "created_at": "2023-04-28T16:58:44.354Z",
        "educations": [...],
        "email_addresses": [...],
        "employments": [...],
        "first_name": "John",
        "id": 123456789012,
        "is_private": true,
        "last_activity": "2023-08-17T18:48:40.844Z",
        "last_name": "Smith",
        "phone_numbers": [...],
        "photo_url": null,
        "recruiter": {...},
        "social_media_addresses": [...],
        "tags": [...],
        "title": "Therapist",
        "updated_at": "2023-08-17T18:48:40.845Z",
        "website_addresses": [...]
    },
    "_airbyte_emitted_at": 1693171071583
}
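A quick way to confirm the gap on extracted data is to check each raw record for the two keys that the Greenhouse candidate object documents (custom_fields and keyed_custom_fields). The sketch below is only illustrative: the helper name `missing_custom_keys` is hypothetical, and the sample record is a trimmed stand-in for the one above, not real connector output.

```python
from __future__ import annotations

# Keys documented on the Greenhouse candidate object that the
# connector output is expected to carry.
EXPECTED_CUSTOM_KEYS = ("custom_fields", "keyed_custom_fields")


def missing_custom_keys(record: dict) -> list[str]:
    """Return the expected custom-field keys absent from a raw Airbyte record.

    Hypothetical helper: assumes the record wraps its payload
    under "_airbyte_data", as in the example above.
    """
    data = record.get("_airbyte_data", {})
    return [key for key in EXPECTED_CUSTOM_KEYS if key not in data]


# Trimmed stand-in for the extracted record shown above.
record = {
    "_airbyte_ab_id": "0ce6d1c7-2584-41f9-8efd-adec3b0bc9be",
    "_airbyte_data": {
        "first_name": "John",
        "last_name": "Smith",
        "id": 123456789012,
    },
    "_airbyte_emitted_at": 1693171071583,
}

print(missing_custom_keys(record))
```

An empty list would mean both keys reached the destination; anything else names the fields the stream dropped.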

Relevant log output

No response


robertoarnetoli commented 1 year ago

Hi @marcosmarxm @malikdiarra @johnlafleur.

I believe the issue can be resolved by simply updating the schema airbyte-integrations/connectors/source-greenhouse/source_greenhouse/schemas/candidates.json, which does not include custom_fields and keyed_custom_fields.

The following properties just need to be added:

"custom_fields": {
    "properties": {}, 
    "type": "object"
},
"keyed_custom_fields": {
    "properties": {},
    "type": "object"
},
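For anyone who prefers to script the change, here is a minimal sketch of adding those two properties to a schema dictionary. It assumes the schema keeps its fields under a top-level "properties" key; the function name `add_custom_field_properties` and the tiny in-memory schema are hypothetical stand-ins, not the real contents of candidates.json.

```python
import copy
import json

# The two property definitions proposed in this comment.
CUSTOM_FIELD_PROPERTIES = {
    "custom_fields": {"properties": {}, "type": "object"},
    "keyed_custom_fields": {"properties": {}, "type": "object"},
}


def add_custom_field_properties(schema: dict) -> dict:
    """Add the custom-field properties to a stream schema in place.

    Existing definitions are left untouched, so the function is
    safe to run more than once.
    """
    properties = schema.setdefault("properties", {})
    for name, definition in CUSTOM_FIELD_PROPERTIES.items():
        properties.setdefault(name, copy.deepcopy(definition))
    return schema


# Hypothetical, heavily trimmed stand-in for candidates.json.
schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "first_name": {"type": "string"},
    },
}

patched = add_custom_field_properties(schema)
print(json.dumps(patched, indent=4))
```

To apply this to the real file, one would load candidates.json with `json.load`, pass the result through the function, and write it back with `json.dump`.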

You can see that the Greenhouse stream returns the missing fields here. Also, the Airbyte Connector Builder detects the updated schema (attached: updated_candidates_schema.json.txt).

Any chance of speeding up this bug fix? Thank you

marcosmarxm commented 1 year ago

@robertoarnetoli you're welcome to submit the fix! Right now this isn't on the team roadmap. If you don't have the tech skills, there is a chance someone will pick this up during Hacktoberfest and submit a fix! :)

robertoarnetoli commented 1 year ago

@marcosmarxm I did it! Let's hope 🤞

robertoarnetoli commented 1 year ago

Hi @marcosmarxm I have made the suggested changes. Please review.