cecobask / imdb-trakt-sync

Automatic sync from IMDb to Trakt (watchlist, lists, ratings and history) using GitHub actions.
MIT License
97 stars 229 forks

Syncing error. #62

Closed aktur closed 2 months ago

aktur commented 2 months ago

Hi, for the past few days I have been seeing this error:

image

cecobask commented 2 months ago

Hi @aktur, could you please set the IMDB_TRACE repository secret to true so I can get more details?

aktur commented 2 months ago

image

...

image

cecobask commented 2 months ago

It looks like the same issue I'm currently facing when running the sync via GitHub Actions. Interestingly, everything works normally when I run it locally, so I can't reproduce it there. Could you please try running it locally and let me know if it works?

paviem commented 2 months ago

Hi. I can confirm - it works fine when run locally.

cecobask commented 2 months ago

Thanks for confirming, @paviem! This is quite a strange issue... It might have something to do with the GitHub runners that the sync workflow gets assigned to. A potential workaround could be to set up one or more self-hosted runners and experiment there. I'm open to suggestions if anybody has a workaround.

aktur commented 2 months ago

Yes, indeed, locally it works normally.

cecobask commented 2 months ago

I'm thinking about using AWS Lambda to run part of the sync workflow. Users will need to set up an AWS account for the required cloud infrastructure. The workflow can be automated via Terraform. I can start working on the solution sometime next week.

The AWS free tier will cover most use cases, but even without it, the costs would be minimal. Using the pricing calculator, I got the following estimate:

Unit conversions

Number of requests: 2 per day * (730 hours in a month / 24 hours in a day) = 60.83 per month
Amount of memory allocated: 128 MB x 0.0009765625 GB in a MB = 0.125 GB
Amount of ephemeral storage allocated: 512 MB x 0.0009765625 GB in a MB = 0.5 GB

---

Pricing calculations

60.83 requests x 300,000 ms x 0.001 ms to sec conversion factor = 18,249.00 total compute (seconds)
0.125 GB x 18,249.00 seconds = 2,281.13 total compute (GB-s)
Tiered price for: 2,281.13 GB-s
2,281.13 GB-s x 0.0000166667 USD = 0.04 USD
Total tier cost = 0.038 USD (monthly compute charges)
Monthly compute charges: 0.04 USD
60.83 requests x 0.0000002 USD = 0.00 USD (monthly request charges)
Monthly request charges: 0.00 USD
0.50 GB - 0.5 GB (no additional charge) = 0.00 GB billable ephemeral storage per function
Monthly ephemeral storage charges: 0 USD
Lambda costs - Without Free Tier (monthly): 0.04 USD
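For anyone who wants to plug in their own numbers, the arithmetic above can be reproduced with a short script. This is only a rough sketch: the function and parameter names are made up for illustration, the prices are the ones quoted in the estimate, and the free tier is ignored. It does not round the request count to 60.83 first, so intermediate values differ very slightly from the calculator output (18,250 s instead of 18,249 s).

```python
HOURS_PER_MONTH = 730  # month length used by the pricing calculator


def monthly_lambda_cost(
    runs_per_day=2,
    duration_ms=300_000,               # about 5 minutes per run
    memory_mb=128,
    price_per_gb_second=0.0000166667,  # USD, taken from the estimate above
    price_per_request=0.0000002,       # USD, taken from the estimate above
):
    """Estimate monthly Lambda charges without the free tier (illustrative only)."""
    requests = runs_per_day * HOURS_PER_MONTH / 24
    compute_seconds = requests * duration_ms / 1000
    gb_seconds = compute_seconds * memory_mb / 1024
    compute_usd = gb_seconds * price_per_gb_second
    request_usd = requests * price_per_request
    # Ephemeral storage is left out: the 512 MB allocation incurs no extra charge.
    return {
        "requests": requests,
        "gb_seconds": gb_seconds,
        "compute_usd": compute_usd,
        "request_usd": request_usd,
        "total_usd": compute_usd + request_usd,
    }


if __name__ == "__main__":
    for name, value in monthly_lambda_cost().items():
        print(f"{name}: {value:.4f}")
```

Changing `runs_per_day` or `duration_ms` shows how the estimate scales for more frequent or longer syncs.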

If someone has better and/or more cost-effective alternatives, please let me know.

aktur commented 2 months ago

That works for me; I already have an AWS account. If you need assistance, I can help with CloudFormation or CDK, so no third-party tools would be necessary for the infrastructure. However, could you clarify why running it inside Lambda would be beneficial? Are you concerned that the GitHub runner might not be able to handle the job?

cecobask commented 2 months ago

I proposed AWS Lambda because the sync workflow runs only every 12 hours by default, so compute is needed on demand rather than continuously. The duration of the workflow (happy path) for an average user with 10 lists should be around 5 minutes. There's also the alternative of using EC2 spot instances.

I have a feeling the compute capacity or network bandwidth of the GitHub runners is not sufficient for the sync workflow. It frequently times out at random points in the code, while the same code never times out locally. I'll have to set up a test Lambda to verify that theory.

CheraHamza commented 2 months ago

@cecobask I think it has to do with some problem in Trakt authorization.

image

cecobask commented 2 months ago

Hi @CheraHamza, this is considered normal behaviour for the application. The log informs you that the rate limit has been reached and that it will retry the request. Usually, the second attempt after such an event is successful.
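To illustrate the general pattern, here is a minimal sketch of retrying a request after a rate-limit response. It is not the application's actual code: `RateLimited`, `call_with_retry`, and the `retry_after` value are hypothetical names, and it assumes the API reports how long the client should wait.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when the API reports that the rate limit was hit."""

    def __init__(self, retry_after):
        super().__init__(f"rate limit reached, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(request, max_attempts=3, sleep=time.sleep):
    """Call `request`, waiting and retrying whenever it raises RateLimited."""
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except RateLimited as err:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            print(f"attempt {attempt}: {err}; retrying")
            sleep(err.retry_after)
```

A log line like the one in the screenshot would correspond to the `print` call here: the first attempt is rejected, the client waits, and the next attempt normally goes through.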

mxkimball commented 2 months ago

@cecobask Is there any workaround for this error? I haven't been able to successfully sync for about two weeks. Is it working for anyone?

cecobask commented 2 months ago

Hi @mxkimball, if you run the sync locally on your machine it works perfectly fine. I’m still investigating a potential solution for the GitHub Actions workflow.

cecobask commented 2 months ago

The issue has been resolved. Please sync your forks and it should work again 👍

aktur commented 2 months ago

Great! So, what exactly was off?

cecobask commented 2 months ago

> Great! So, what exactly was off?

Something strange was happening to the browser session (cookies) after the first tab was closed. The app was spawning new tabs for most of the activities within IMDb. I couldn't reproduce the issue locally. After some trial and error and debugging, I found that the authentication was being removed from the session, though I still have no idea what caused it. I fixed it by reusing a single tab for all activities.