artefactory / artefactory-connectors-kit

ACK is an E(T)L tool specialized in API data ingestion. It is accessible through a Command-Line Interface. The application allows you to easily extract, stream and load data (with minimum transformations), from the API source to the destination of your choice.
GNU Lesser General Public License v3.0
42 stars 6 forks source link

As a User, I can launch a request knowing that all readers follow a common behavior on date parameters #67

Open gabrielleberanger opened 3 years ago

gabrielleberanger commented 3 years ago

WHY Today, the behavior of readers expecting date parameters (start_date, end_date, date_range, etc.) is not harmonized.

In particular, readers usually do not know how to prioritize the date parameters given as inputs. For instance: the Google Ads reader crashes if you try to make a request including the start_date and end_date parameters + a date_range parameter.

HOW Define a convention for all readers, and implement it.

ali-bellamlih commented 3 years ago

Hello !

So I've been working a bit on this issue. I have taken a look at all the readers in order to understand how they work and how they manage the date range option. First, here is a basic overview :

- Adobe Reader:

- Adobe Reader v2:

- DBM Reader :

- DCM Reader :

- Facebook Reader :

- Ga Reader :

- Googleads reader :

- radarly_reader :

- SA360reader, SearchConsoleReader, ttd_reader, twitter_ready :

- yandex_statistics_reader :

So here's a proposition to manage date ranges in a uniform way for all readers :

  1. Some readers already do it, others don't : define a function validate_dates that raises an exception when start_date is older than end_date

  2. The date_range can be defined in 3 different and hypothetical ways :

    • Given both a start_date and an end_date
    • Given an end_date and a generic date_range parameter often called day_range in the function
    • Implicitly by not informing start_date and end_date and giving a day_range
  3. An exception must be raised when :

    • Both start date and end date are not defined, no custom date range can be defined
    • A generic date range is defined (like "PREVIOUS_DAY") as well as a start_date and an end date
    • Only the end_date or start_date and day_range are defined (Second point of the previous bullet point. We get rid of this possibility for clarity and practicality purposes)

Should we create unique functions in the helpers module and make all readers import it ? Is it feasible and practical ? Let me know what you think.

benoitgoujon commented 3 years ago

Hello @ali-artefact,

Thank you for this overview of date management in NCK and your suggestions.

I agree with your first point. First, it is ok to duplicate code but it will be one of the points to be tackled in the incoming refactoring.

Regarding your second and third points, one thing you pointed out in your overview is how a couple of readers use default values (especially the DBM reader). My take is that default values should not exist at all. We want the user to explicitly define the period of time of the API request.

So I suggest to remove all default values (set as a parameter of click options or directly in the code) and then make a custom validation function that follows the requirements you proposed.

Finally, to be sure I've understood correctly your proposal, do you want to raise an exception if the case _Given an end_date and a generic date_range parameter often called dayrange in the function happens?

ali-bellamlih commented 3 years ago

Thank you for your prompt answer @benoitgoujon,

I agree with you, as discussed, all implicit behaviours should be deleted.

Regarding your last concern, I reckon I should have explicitly given the sentence I was referring to. Anyways, you got it right. Although theoretically this could be a solution, I assume the less possibilities are given to the user, the better it is.

benoitgoujon commented 3 years ago

Yes absolutely!

So to summarize, here are our technical specs (correct me if I'm wrong):

The user can define the date range in 2 ways:

One way does not override the other. If there is a conflict, we throw an exception. All other cases that do not match the two conditions previously defined will result in an exception being thrown.

Does that sound good to you?

gabrielleberanger commented 3 years ago

@ali-artefact thank you very much for your detailed overview of the current behavior of the different readers.

I am aligned with the specs you just defined together, and the fact of removing default behaviors regarding dates, which can be confusing.

Good job guys! 👍

ali-bellamlih commented 3 years ago

Then we're all set 👍

tom-grivaud commented 3 years ago

@ali-artefact @gabrielleberanger @benoitgoujon Hi guys, while developing the mytarget connector, I asked myself few questions regarding the day ranges. Apparently, these are made to match a specific API parameter. Nevertheless, it could be usefull to allow the use of these pre-defined day ranges for other platforms as it makes it easier for humans to know which range is selected instead of the dates themselves which can be tricky especially as months and days can switch from one country to another.

Therefore, what do you think about adding a new layer allowing the use of day ranges for api which normally don't allow day ranges as parameters ?

One idea could be to add a function which will set start date and end date based on the current date and the range specified by the user. I guess this kind of solution also has drawbacks so let me know if it would make sense to you in this case.

benoitgoujon commented 3 years ago

Hi @tom-grivaud,

To me, this is fine and that could bring value to the user indeed.

I think, this is especially relevant in the case you do not generate the commands dynamically and you don't have an orchestrator to populate the date parameters at runtime.

So, I would vote in favour of this new feature and I can develop it if you want.

tom-grivaud commented 3 years ago

Alright, what do you guys think @gabrielleberanger @ali-artefact ? I'd love to do the feature by myself but I already have too many tasks on for now so feel free to take the responsibility for this and I'll be glad to review your PR if Gabrielle and Ali validate the feature.

gabrielleberanger commented 3 years ago

@tom-grivaud as I told you earlier, I think it's a great idea too !

I am creating a dedicated issue for it. Thank you @benoitgoujon for taking it :ok_hand:

benoitgoujon commented 3 years ago

@ali-artefact one note before you jump into the code, the behaviour of the Adobe v2 reader has slightly evolved after @tom-grivaud's suggestion.

So, what you described is no longer up to date. There is the possibility to add a date range now. The rest is still valid 🙂

ali-bellamlih commented 3 years ago

Duly noted ! No problem.