src-d / ghsync

GitHub API v3 > PostgreSQL
https://sourced.tech
GNU General Public License v3.0
9 stars 8 forks source link

Status table to add possibility to use those data for progress chart #48

Closed se7entyse7en closed 5 years ago

se7entyse7en commented 5 years ago

Closes #31.

This PR adds the status table as follows:

ghsync=# select * from status;
 id |   org    |    part    | done | failed | total
----+----------+------------+------+--------+-------
  2 | bblfsh   | user       |    0 |      0 |
  3 | viralize | repository |    0 |      0 |
  4 | viralize | user       |    0 |      0 |
  1 | bblfsh   | repository |    8 |      0 |    32

So for each organization, we can track separately different steps so that we can have more fine-grained progress information. On the other hand, we can just group by org and sum to have overall progress. Unfortunately, with the current architecture, the total number of resources is not available, so if we want to show the overall progress it will just go up and down as the total number changes over time. On the contrary, every single row can show the correct progress or be in a pending state that is identified by the total column being NULL.

Then on UI we can use this table for charting the progress, or at first even just print this table as-is (maybe with just a little guide on the meaning).

Some parts are copy-pasted and could be refactored, I just tried to close the issue without very big thing to fix.

se7entyse7en commented 5 years ago

@dpordomingo I completely agree with you. But I think that it has been already discussed somewhere the possibility of being fully error-resilient instead of just exiting at first error seen. Regarding the kallax thing, yes I've just gone with the fastest way given the deadline and given my complete ignorance with kallax, and applied the do and iterate approach in mind. And definitely with this table, we can already show something useful to the user.

Regarding the failed counter I'm gonna take a look into it.

se7entyse7en commented 5 years ago

@carlosms in the end I also include the user part here as it was straightforward to add it.

carlosms commented 5 years ago

Also, now that we have a table storing the progress of the process, why not being failure resilient, and not stopping if we can try with a different document, org?

In my opinion these are not related, and it will be better if we just limit this PR to the table counters, and address error handling separately.

se7entyse7en commented 5 years ago

In my opinion these are not related, and it will be better if we just limit this PR to the table counters, and address error handling separately.

Agree.

dpordomingo commented 5 years ago

Again: not trying to block this merge, which LGTM, so it'll be great if we move on, and iterate later. Just tried to document my concerns in the PR, in order to be considered in the future when improving the import process.

smacker commented 5 years ago

Imo there are some very sane concerns raised by David. Let's at least discuss them before merge. And better fix.

se7entyse7en commented 5 years ago

@dpordomingo I'm merging, but please feel free to open issues if you think that some points should be improved 👍