bitcointranscripts / transcription-review-backend

7 stars 11 forks source link

feat: add review status to API response #296

Open Extheoisah opened 3 months ago

Extheoisah commented 3 months ago

closes #294 and #295

This pr adds review status to the API response when a review detail is fetched from the db. This is done on the API level and not in the database level. Also, a new "rejected" status has been added based on this criteria.

kouloumos commented 2 months ago

@Extheoisah how did you test this?

I'm running this branch locally, but using the staging database and redis instance.

I'm using the /api/reviews/all endpoint in swagger and I'm filtering by status. I've tested with all the available statuses (rejected is missing btw from swagger) and I can't see a status field for the review in the response.

Edit: I wasn't using the correct setup, that's why it didn't work. I wanted to use the staging db, but I forgot that by having NODE_ENV=staging the swagger is pointing to the staging instance. Therefore I wasn't testing against the actual PR.

Extheoisah commented 2 months ago

@Extheoisah how did you test this?

I'm running this branch locally, but using the staging database and redis instance.

I'm using the /api/reviews/all endpoint in swagger and I'm filtering by status. I've tested with all the available statuses (rejected is missing btw from swagger) and I can't see a status field for the review in the response.

I tested using my local db. When you query the /api/reviews/all endpoint, you should see the statuses of all reviews added to each review result. I forgot to update the swagger docs, so I'll include that.

Extheoisah commented 2 months ago

@kouloumos updated the code to include the "rejected" status in Swagger docs and also added it to the buildCondition function so you can pass it as a parameter in the review status query

Extheoisah commented 2 months ago

There is something that doesn't feel right with the "duplicate" calculation of status, once on the query using buildCondition() and once for each result with computeReviewStatus().

I consulted my personal assistant ( 😎 ) for some ideas on how we can avoid this. Its suggestion was to either use triggers or create another view that will derive the status. It seems that both of those ideas get us back to having a single source of truth and simplify the implementation. The chat discussion offers more details about those alternative implementation directions. Please review the chat and let me know if these suggestions make sense to you

Additionally, I believe removing the notion of "archived" reviews could simplify our process. Currently, when querying for expired reviews, we must check both "archived" and "not archived" statuses. However, the expired status is already determined by whether a certain amount of time has passed. Thus, the "archived" status seems redundant. It might become useful for implementing different limits for different reviewers (see #286), but even then, we would still rely on createdAt to determine the timeframe.

This was initially my suggestion qhen this issue was raised https://github.com/bitcointranscripts/transcription-review-backend/issues/295#issuecomment-2153657103 but you didn't agree with it. I'm glad after the 1st implementation following your suggestion you agree that the status column makes things easier.

As regards removing the "archivedAt" column, the reasons you provided are not concrete enough for me to agree with removing the column. Remember we check for both "archived" and "not archived" statuses becuase of the cron job that runs every 24hrs to update the review status

kouloumos commented 2 months ago

This was initially my suggestion qhen this issue was raised https://github.com/bitcointranscripts/transcription-review-backend/issues/295#issuecomment-2153657103 but you didn't agree with it. I'm glad after the 1st implementation following your suggestion you agree that the status column makes things easier.

In my defense, at the time, you didn't provide a concrete way to do that and after I explained my thought process and how I was thinking about the downsides of adding the new column you agreed.

So, what do you think about the proposed solutions for implementing the status field?

As regards removing the "archivedAt" column, the reasons you provided are not concrete enough for me to agree with removing the column. Remember we check for both "archived" and "not archived" statuses becuase of the cron job that runs every 24hrs to update the review status

What I am saying is that the cron jobs that runs every 24h to archive reviews doesn't offer any benefit as on both checks ( "archived" and "not archived" statuses) we are also checking if we are in the 24h hours timeframe from the review's creation. So in a scenario where the archivedAt check is removed, nothing would change and we will have a single check that checks that a review:

  [Op.and]: {
      mergedAt: { [Op.eq]: null }, // has not been merged
      submittedAt: { [Op.eq]: null }, // has not been submitted
      createdAt: { [Op.lt]: timeStringAt24HoursPrior }, // expired

Does that make sense?

Emmanuel-Develops commented 2 months ago

Some context here:

We wanted inferred states rather than declaring them. The archivedAt field helps achieve this.

Initially, I started with cron jobs to manage the queuing, but cron is in memory so whenever the process restarts (server restarts) we lose the cron/timing (next 24 hrs from createdAt).

The dailyCheckForMissedExpiredReviews was a fallback incase of uncaught expired reviews.

I switched to bull with Redis to manage the queueing process instead of cron, which solved the issue with uncaught expired reviews. The expiryQueue manages the queue process for each review.

In hindsight, I don't think the dailyCheckForMissedExpiredReviews is necessary anymore as Redis manages the queue. It is not detrimental and in some very niche cases could help. I'm indifferent regarding its removal.

cc: @Extheoisah @kouloumos

Extheoisah commented 2 months ago

So, what do you think about the proposed solutions for implementing the status field?

I agree that's the way forward. I'll work something out and see.

What I am saying is that the cron jobs that runs every 24h to archive reviews doesn't offer any benefit as on both checks ( "archived" and "not archived" statuses) we are also checking if we are in the 24h hours timeframe from the review's creation. So in a scenario where the archivedAt check is removed, nothing would change and we will have a single check that checks that a review:

We can remove the cron job. The cron job was added a while back because we had one or two cases of non-archived but expired reviews. So the cron job was a fallback to catch those leaks. @Emmanuel-Develops would have more context to help us determine if we can remove it without any issues. But regarding removing the archivedAt column, we'll need to take care of this thoroughly. The archivedAt field is used in the webhook controller to timestamp when a review is closed but not merged and to update a transcript status when a review is merged. To bypass this, we can add two new status field "closed" or "archived" and "merged" to the new status enum column ("archived", "expired", "pending", "active", "rejected", "merged") so that these serve as the source of truth. But having the timestamp instead of just the status is relevant data for us. What do you think?

kouloumos commented 2 months ago

I agree that's the way forward. I'll work something out and see.

:rocket:

But regarding removing the archivedAt column, we'll need to take care of this thoroughly. The archivedAt field is used in the webhook controller to timestamp when a review is closed but not merged and to update a transcript status when a review is merged.

You are right, I forgot that we are using archivedAt to identify a rejected review. Then it doesn't make sense to remove it. Then my suggestion becomes to just remove the archivedAt check when checking for expired reviews.