airbytehq / airbyte

[source-hubspot] New streams: Associations between objects #44554

Open kev-datams opened 3 weeks ago

kev-datams commented 3 weeks ago

Topic

Add new streams that retrieve up-to-date associations, including their labels, between HubSpot objects using the official API endpoints.

Relevant information

According to the official HubSpot documentation, there are endpoints dedicated to associations between objects. It would be useful to collect associations on their own, with their labels, instead of retrieving them through the usual object streams. Another advantage: these endpoints may make associations dynamically retrievable for all object types (i.e. no need to add missing association types to the existing object streams).
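To make the idea concrete, here is a minimal sketch of how a dedicated associations stream could turn one API response into flat records. The response shape (`results`, `from.id`, `to[].toObjectId`, `to[].associationTypes[]` with `category`, `typeId`, `label`) is an assumption modeled on the HubSpot v4 associations batch-read endpoint and should be checked against the official documentation. The function name, the output field names, and the sample IDs and type IDs are hypothetical.

```python
from typing import Any, Dict, Iterator


def flatten_associations(
    from_type: str, to_type: str, response: Dict[str, Any]
) -> Iterator[Dict[str, Any]]:
    """Yield one record per (from object, to object, association type).

    Assumes a response shaped like the v4 associations batch-read payload;
    the output field names are illustrative, not an existing Airbyte schema.
    """
    for result in response.get("results", []):
        from_id = str(result["from"]["id"])
        for target in result.get("to", []):
            for assoc_type in target.get("associationTypes", []):
                yield {
                    "from_object_type": from_type,
                    "from_object_id": from_id,
                    "to_object_type": to_type,
                    "to_object_id": str(target["toObjectId"]),
                    "association_category": assoc_type.get("category"),
                    "association_type_id": assoc_type.get("typeId"),
                    "association_label": assoc_type.get("label"),
                }


# Made-up sample payload: one company associated with two contacts,
# one of them carrying an extra user-defined label.
sample_response = {
    "results": [
        {
            "from": {"id": "101"},
            "to": [
                {
                    "toObjectId": 2001,
                    "associationTypes": [
                        {"category": "HUBSPOT_DEFINED", "typeId": 280, "label": None},
                        {"category": "USER_DEFINED", "typeId": 37, "label": "Decision maker"},
                    ],
                },
                {
                    "toObjectId": 2002,
                    "associationTypes": [
                        {"category": "HUBSPOT_DEFINED", "typeId": 280, "label": None},
                    ],
                },
            ],
        }
    ]
}

records = list(flatten_associations("companies", "contacts", sample_response))
```

Emitting one row per association type keeps labels queryable downstream without unnesting, at the cost of several rows for the same pair of objects.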

Example of use case solving a current huge pain 🙏

  1. existing object streams run in incremental mode only retrieve the associations that exist at the time of data extraction
  2. if an association is created after the incremental sync, it is NOT retrieved (unless the object itself is modified again and therefore falls within the scope of a subsequent incremental sync)
  3. as Engagements are rarely modified once created, but may later be associated with many objects, we miss a huge number of Engagements x Deals/Contacts/Companies associations; we also cannot re-sync all engagements on a regular schedule because of the data volume

-> if we had new dedicated streams for associations between the usual objects (e.g. Company x Contact, Deal x Engagement calls, etc.), we could run an efficient daily full-refresh sync of associations only and then use up-to-date association data for our internal use cases 👍
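The gap described in the three points above can be reproduced with a small, self-contained simulation. It assumes, as observed in this issue, that adding an association does not change the object's modification timestamp. The `Engagement` class, both sync functions, and the day-number timestamps are hypothetical simplifications, not the connector's actual code.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Engagement:
    id: str
    updated_at: int  # simplified timestamp: a day number
    deal_ids: Set[str] = field(default_factory=set)


def incremental_sync(
    engagements: List[Engagement], cursor: int
) -> Set[Tuple[str, str]]:
    """Return associations only for objects modified after the cursor."""
    return {
        (e.id, deal_id)
        for e in engagements
        if e.updated_at > cursor
        for deal_id in e.deal_ids
    }


def full_refresh_associations(
    engagements: List[Engagement],
) -> Set[Tuple[str, str]]:
    """Return every association, regardless of modification time."""
    return {(e.id, deal_id) for e in engagements for deal_id in e.deal_ids}


call = Engagement("call-1", updated_at=1, deal_ids={"deal-A"})
warehouse: Set[Tuple[str, str]] = set()

# Day 1: the new engagement is synced together with its current association.
warehouse |= incremental_sync([call], cursor=0)

# Day 2: a second deal is associated; updated_at is assumed NOT to change.
call.deal_ids.add("deal-B")
warehouse |= incremental_sync([call], cursor=1)  # picks up nothing

# A full refresh of associations alone reveals what incremental sync missed.
missed = full_refresh_associations([call]) - warehouse
```

Here `missed` contains the `call-1` x `deal-B` pair, which is the kind of association a dedicated full-refresh stream would recover without re-syncing the engagements themselves.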

NB: the dedicated association streams above could even move to incremental sync as soon as HubSpot enables Search on the associations API endpoints (as it does for other objects) 🤞

marcosmarxm commented 3 weeks ago

@airbytehq/dev-python can someone take a look at this HubSpot proposal?

girarda commented 3 weeks ago

That's a good idea. I'm adding this to our backlog, but I don't have a timeline to offer at the moment.

kev-datams commented 3 weeks ago

> That's a good idea. I'm adding this to our backlog, but I don't have a timeline to offer at the moment.

@girarda, does "no timeline at the moment" mean definitely not within the next 3 months? How often is the backlog reviewed, so that this could be prioritized in the future?

Thanks!