verida / data-connector-server

1 stars 2 forks source link

Add youtube handler inside google provider #83

Closed chime3 closed 2 months ago

chime3 commented 3 months ago

What's changed?

How to test these changes

Worked on these tests, but pre-built tests are broken now and need to discuss before make changes.

Anything to be aware of?

chime3 commented 2 months ago

@chime3 See my review comments.

The tests aren't passing. Please share screenshots of all tests passing when submitting this PR for review.

youtube-favourite-test youtube-following-test youtube-post-test Modified common.tests.ts @tahpot

chime3 commented 2 months ago

Test results

youtube-post: Appears to be working. I only have two youtube uploads so had to tweak the tests a bit locally

youtube-following: Failing: AssertionError [ERR_ASSERTION]: Sync is still active

youtube-favourite: Passing

Sync

I tried syncing via the command line and the code produces save errors indicating the schema's aren't being populated correctly.

I have made a fix in the develop branch. Can you merge develop, resole any issues and then run sync from the command line to confirm data is being saved correctly.

Other

I made some minor commits to improve the tests, see my commits.

I got it, thank you

chime3 commented 2 months ago

Test results

youtube-post: Appears to be working. I only have two youtube uploads so had to tweak the tests a bit locally

youtube-following: Failing: AssertionError [ERR_ASSERTION]: Sync is still active

youtube-favourite: Passing

Sync

I tried syncing via the command line and the code produces save errors indicating the schema's aren't being populated correctly.

I have made a fix in the develop branch. Can you merge develop, resole any issues and then run sync from the command line to confirm data is being saved correctly.

Other

I made some minor commits to improve the tests, see my commits.

I merged develop into feature/youtube-handler and run test passing. image

Please try to reset connection if AssertionError [ERR_ASSERTION]: Sync is still active error persists.

tahpot commented 2 months ago

Thanks, here's my feedback:

  1. youtube-following now works. The issue was I only had 6 subscriptions, but needed 7 for a third page of results to exist and the sync to remain set to "active". Can you add notes on what you need to do to your youtube account for the tests to pass in the providers/google/README.md file? ie: Youtube following needs at least 7 subscriptions, with one created within the last 24 hrs for tests to pass.
  2. Sync is now saving data
  3. There is a bug with all the handlers not tracking the position correctly. If you run the sync multiple times, it re-processes the existing records and re-saves them. The purpose of the sync position is to only process unprocessed items.
chime3 commented 2 months ago

Thanks, here's my feedback:

  1. youtube-following now works. The issue was I only had 6 subscriptions, but needed 7 for a third page of results to exist and the sync to remain set to "active". Can you add notes on what you need to do to your youtube account for the tests to pass in the providers/google/README.md file? ie: Youtube following needs at least 7 subscriptions, with one created within the last 24 hrs for tests to pass.

I added Prerequisites in README.md file.

chime3 commented 2 months ago
  1. Sync is now saving data
  2. There is a bug with all the handlers not tracking the position correctly. If you run the sync multiple times, it re-processes the existing records and re-saves them. The purpose of the sync position is to only process unprocessed items.

I referred to gmail handler. Is gmail sync not implemented correctly as well? @tahpot

tahpot commented 2 months ago
  1. Sync is now saving data
  2. There is a bug with all the handlers not tracking the position correctly. If you run the sync multiple times, it re-processes the existing records and re-saves them. The purpose of the sync position is to only process unprocessed items.

I referred to gmail handler. Is gmail sync not implemented correctly as well? @tahpot

Good question!

I haven't actually reached the end of my gmail, so it's possible it doesn't work correctly and I hadn't yet noticed.

I'll approve this PR and we can address this problem in the Google provider separately.

I have built a different process for telegram which I will port across to gmail, youtube and then make sure this problem is also solved.

tahpot commented 2 months ago

@chime3 I have made some changes to the underlying schemas.

Please review these changes I have made in another branch and update the youtube handlers to match:

  1. Add a syncMessage into the handler that will be displayed to users to help them understand the current state of processing. https://github.com/verida/data-connector-server/pull/90/commits/3bf6aa5aba27b68afc2b42415e6295962b07f675
  2. The enum status for the progress have changed, see here: https://github.com/verida/data-connector-server/pull/90/commits/ccf8c973d632f0c31315aac2f01f0983263e8814

Note: You may want to merge my branch into your branch so you get the new enum values.

As I had to deploy the new schemas to work on my code, it has broken your code as it expects the old schemas. So you'll need to fix this before your code works again.

tahpot commented 2 months ago
  1. Sync is now saving data
  2. There is a bug with all the handlers not tracking the position correctly. If you run the sync multiple times, it re-processes the existing records and re-saves them. The purpose of the sync position is to only process unprocessed items.

I referred to gmail handler. Is gmail sync not implemented correctly as well? @tahpot

It turns out there was a bug in the core handler that caused this issue which I have now fixed in the other branch.

If you merge the branch (as mentioned above) it should fix the issue.... however you will also need to delete all your connections and start again as they will be corrupt with incorrect data caused by the bug.

chime3 commented 2 months ago

There is a conflict that needs to be fixed, then this is approved for merging

Solved

tahpot commented 2 months ago
  1. Sync is now saving data
  2. There is a bug with all the handlers not tracking the position correctly. If you run the sync multiple times, it re-processes the existing records and re-saves them. The purpose of the sync position is to only process unprocessed items.

I referred to gmail handler. Is gmail sync not implemented correctly as well? @tahpot

It turns out there was a bug in the core handler that caused this issue which I have now fixed in the other branch.

If you merge the branch (as mentioned above) it should fix the issue.... however you will also need to delete all your connections and start again as they will be corrupt with incorrect data caused by the bug.

I can confirm that my fix now means the youtube handlers don't save the same item more than once.

tahpot commented 2 months ago

There is a conflict that needs to be fixed, then this is approved for merging

Solved

You need to merge this branch: https://github.com/verida/data-connector-server/issues/89

chime3 commented 2 months ago

There is a conflict that needs to be fixed, then this is approved for merging

Solved

You need to merge this branch: #89

Yeah, I got it

chime3 commented 2 months ago
  1. Sync is now saving data
  2. There is a bug with all the handlers not tracking the position correctly. If you run the sync multiple times, it re-processes the existing records and re-saves them. The purpose of the sync position is to only process unprocessed items.

I referred to gmail handler. Is gmail sync not implemented correctly as well? @tahpot

It turns out there was a bug in the core handler that caused this issue which I have now fixed in the other branch. If you merge the branch (as mentioned above) it should fix the issue.... however you will also need to delete all your connections and start again as they will be corrupt with incorrect data caused by the bug.

I can confirm that my fix now means the youtube handlers don't save the same item more than once.

Okay, let me check after merge

chime3 commented 2 months ago
  1. Sync is now saving data
  2. There is a bug with all the handlers not tracking the position correctly. If you run the sync multiple times, it re-processes the existing records and re-saves them. The purpose of the sync position is to only process unprocessed items.

I referred to gmail handler. Is gmail sync not implemented correctly as well? @tahpot

It turns out there was a bug in the core handler that caused this issue which I have now fixed in the other branch. If you merge the branch (as mentioned above) it should fix the issue.... however you will also need to delete all your connections and start again as they will be corrupt with incorrect data caused by the bug.

I can confirm that my fix now means the youtube handlers don't save the same item more than once.

Handlers save items only once now.