rowild opened 1 year ago
Further things I did:

- `PRAGMA integrity_check`
- I eventually stopped Docker, deleted all containers and images, and ran `docker compose` again, but even that didn't change the problem.
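(The integrity check mentioned above can be run against the SQLite file directly, assuming the `sqlite3` CLI is installed and the container is stopped so the file isn't locked; the path here matches the compose file below:)

```shell
sqlite3 ./database/data-jvds.db "PRAGMA integrity_check;"
# a healthy database prints: ok
```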
Eventually I set it up locally using the exact same docker-compose.yaml that I use on DigitalOcean:
```yaml
version: '3'
services:
  cache_jvds:
    container_name: cache_jvds
    image: redis:6
    networks:
      - jvds
  directus_jvds:
    container_name: directus_jvds
    image: directus/directus:latest
    ports:
      - 8056:8055 # external:internal
    volumes:
      - ./uploads:/directus/uploads
      - ./database:/directus/database
      - ./extensions:/directus/extensions
    networks:
      - jvds
    depends_on:
      - cache_jvds
    environment:
      KEY: '...'
      SECRET: '...'
      DB_CLIENT: 'sqlite3'
      DB_FILENAME: './database/data-jvds.db'
      CACHE_ENABLED: 'true'
      CACHE_STORE: 'redis'
      CACHE_REDIS: 'redis://cache_jvds:6379'
      ADMIN_EMAIL: '...'
      ADMIN_PASSWORD: '...'
      PUBLIC_URL: '...'
      STORAGE_LOCATIONS: 'local'
      STORAGE_LOCAL_DRIVER: 'local'
      STORAGE_LOCAL_ROOT: './uploads'
networks:
  jvds:
    name: jvds
    external: true
```
Even here, no collection that has been created is shown, even though the data was saved to the database (see video).
https://user-images.githubusercontent.com/213803/230759916-63652675-4202-42d3-8afe-e47385e03dcd.mp4
UPDATE 1: I also tried a local instance using Postgres - the same thing happens. Reverting back to 9.23.4 DOES show the collection.
UPDATE 2: I am experiencing this problem with 9.23.4, 9.23.3 and 9.23.2 as well. Only in 9.23.1 collection creation works as expected.
I am unfortunately unable to reproduce this 😬 Could you try reproducing it with all caches disabled (`CACHE_ENABLED`/`CACHE_SCHEMA`/`CACHE_PERMISSIONS`) and inspect the actual API call made when trying to apply the changes?
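(For reference, disabling all three caches for this test would look like the following in the `environment:` block of the compose file; only the three cache variables are shown, everything else stays unchanged:)

```yaml
environment:
  CACHE_ENABLED: 'false'
  CACHE_SCHEMA: 'false'
  CACHE_PERMISSIONS: 'false'
```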
We haven't been able to reproduce this based on the given information, so I'll close this for now. Happy to keep discussing in this thread, and we'll reopen once more information becomes available 👍🏻
Thank you for your feedback, @br41nslug & @rijkvanzanten! I did what you told me – I am sorry, I was not aware of the CACHE parameter options! Here is what I did:
I tried a new local installation with a new database and that one works indeed fine!
But as soon as I use my existing SQLite file – by replacing the just created, empty new data.db file – the problem as described above remains.
So next I executed `docker compose down`, deleted all containers and images, and made sure to set those `CACHE_*` options you recommend to `false` in my docker-compose.yml. Eventually `docker compose up -d` creates a Directus 9.24 instance with my old database file (the one with all the data), on which I can now work (change fields, create new collections, ...).

Buuuuut... when I then go back to the yml file and enable all cache options again (and do all the `docker compose down` and `up` steps), I am again facing the same problem as initially described.
I also experimented with `CACHE_AUTO_PURGE`, but to no avail. I assume changing `KEY` and `SECRET` would be counter-productive?
This happens locally as well as on DigitalOcean. And I should mention that I only tried the above things with SQLite, not any other db.
Do you happen to have any other ideas? I'd be happy to provide my database file, if you think that could help. Or my DigitalOcean account.
(I remember that I created my data with Directus 9.11 (or 15?), locally (and not with Docker, but going with the Node version). Then, after 5, 6 months, I tried DigitalOcean, and as soon as the setup worked there, I copied over that old database file. I had to do some work like changing the IDs of the user groups etc, but I eventually could work with it. And I still can do so with v9.23.1, but not any version after that...)
Thank you again for your time!
I think I found the problem. As soon as I add `PUBLIC_URL: 'https://localhost:8057'` (the port matches the port mapping `8057:8055`), the info is still written to the database, but Directus no longer displays it. However, it still recognises that a collection already exists should you try to create one with the same name.
Steps to reproduce:
STEP 1: setup
You can use this file:
```yaml
version: '3'
services:
  cache:
    container_name: cache
    image: redis:6
    networks:
      - directus
  directus:
    container_name: directus
    image: directus/directus:latest
    ports:
      - 8055:8055
    volumes:
      - ./uploads:/directus/uploads
      - ./database:/directus/database
    networks:
      - directus
    depends_on:
      - cache
    environment:
      KEY: '255d861b-5ea1-5996-9aa3-922530ec40b1'
      SECRET: '6116487b-cda1-52c2-b5b5-c8022c45e263'
      DB_CLIENT: 'sqlite3'
      DB_FILENAME: './database/data.db'
      ADMIN_EMAIL: 'admin@example.com'
      ADMIN_PASSWORD: 'd1r3ctu5'
      CACHE_ENABLED: 'true'
      CACHE_STORE: 'redis'
      CACHE_REDIS: 'redis://cache:6379'
      # PUBLIC_URL: 'https://127.0.0.1:8055'
networks:
  directus:
```
Then run `docker compose up -d`.
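(One quick way to confirm the instance is reachable – my suggestion, assuming the `8055:8055` mapping from the file above – is Directus' ping endpoint:)

```shell
# should answer with "pong" when the app is up
curl -i http://localhost:8055/server/ping
```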
So far everything should be working. Continue with
STEP 2: break it

Uncomment the `PUBLIC_URL` line in docker-compose.yml, then:

```shell
docker compose down
docker rm [container]
docker rmi [images]   # to make sure the redis cache is removed
docker compose up --force-recreate --build -d
```
STEP 3: downgrade to 9.23.1

Edit docker-compose.yml and replace `directus/directus:latest` with `directus/directus:9.23.1`, then:

```shell
docker compose pull directus && docker compose up --force-recreate --build -d
```
Beginning with `directus:9.23.2`, these behaviours are broken.
STEP 4: fixing it again

Unfortunately, simply removing the `PUBLIC_URL` from the yml file won't fix the problem, and neither does stopping and deleting the images in order to get rid of the redis cache. It seems that at this point the db file is corrupt. (Sometimes a process runs that quickly creates and deletes a `data.db-journal` file – not sure what this does.) Only completely deleting any cache and using a previously backed-up database file makes the system work again. (Or turning off the cache, as mentioned earlier.)
Only tested with SQLite in my local environment (MacBook M2).
Here is a video with my workflow with STEP 1 and STEP 2 (sorry, didn't record the downgrade): https://dl.rowild.at/directus-not-saving-collections__2023-04-11.mp4
Is my understanding of `PUBLIC_URL` wrong?

The `PUBLIC_URL` should have no direct effect on the database itself. However, it looks like you're pointing an `https://` public URL directly at the Docker container without a proxy handling the certificates. Keep it at `http://` for local development.
Ok, I will keep that in mind. However, it does not explain why 9.23.1 works and every version after it does not. Also, on my DigitalOcean instances Nginx is handling the proxies, and I still have the exact same problem. I am quite surprised that I seem to be the only one who can create this problem :-) Can you reproduce the problem with my explanation from above?
The `PUBLIC_URL` does not seem to influence the DB itself: on inspecting the DB file, the data is saved. But reading from the DB again right after saving seems to be the problem...
(Is there any other way to delete the cache aside from deleting and reinstalling redis?)
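(One way to clear a Redis-backed cache without deleting and reinstalling anything would be to flush it directly with `redis-cli` – assuming the cache service is named `cache`, as in the reproduction file above:)

```shell
# remove every key from the running redis container
docker compose exec cache redis-cli FLUSHALL
```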
These 2 steps are not needed after a `docker compose down`:

```shell
docker rm [container] # covered by compose down
docker rmi [images]   # only forces a redownload of the latest tagged image; data is not stored here
```

nor are these flags: `--force-recreate --build`.

Having said that, I went through the steps as described, and unfortunately "Step 2" is not breaking for me. The `data.db-journal` file you mention makes me think something may have gone wrong at a filesystem/database level, for which my tests on another operating system may not be representative of your specific environment. A quick google turned up this post about the new Apple M CPUs: https://sqlite.org/forum/forumpost/d2432b5dc2
@br41nslug Thank you for your feedback! Very much appreciated! I will check the link you posted and try to dig deeper.
I hope I will then also find the reason why my DigitalOcean setup does not work either (on DO there is no M1/M2 problem, everything is Ubuntu there...)
From what I understand, I have to `docker rm [container]` because `compose` only stops the containers but doesn't delete them. They need to be deleted, though, otherwise `docker rmi [images]` won't work.
I do understand that those steps are not needed. Just wanted to be as thorough as possible... Thanks for commenting on them!
I will report again should I find something. Meanwhile thank you very much for your help! :-)
We have been experiencing the very same issue since upgrading from 9.23.1 to 9.24 and 9.25. We use Postgres and Docker, and updates suddenly don't get reflected anymore. Sometimes we get a "permission denied" error when refreshing the view or saving an item.
We tried signing out and in again, creating new admin users and various other things like re-deploying with different env settings.
What has worked today was setting all three cache settings, as mentioned above, to false. Only then were we able to perform item updates without issues. As soon as we flip the cache back on, the issue reappears. Is it possible that some cache permissions have changed since 9.23.1? Can we manually invalidate the whole cache somehow?
Many thanks for your help.
Does the issue persist when enabling both `CACHE_SCHEMA` and `CACHE_AUTO_PURGE`, @stx-chris?
I tried again with possible variants of `CACHE_ENABLED`, `CACHE_PERMISSIONS`, `CACHE_SCHEMA`, and `CACHE_AUTO_PURGE`, and found that `CACHE_AUTO_PURGE` seems to be the culprit, at least when used with `CACHE_STORE=memory`.
Whenever it is disabled (= the default), both the view and the open document don't reflect the changed value. When it is enabled, save works as expected and the view/item are updated accordingly. Interestingly, it only shows this behaviour in Docker; locally (macOS) it works either way.
Might it have to do with the recent changes of #17763?
@rowild Can you try setting `CACHE_AUTO_PURGE` to `true` and confirm whether this (temporarily) solves your problem too?
@stx-chris I currently have these settings, and for the moment they seem to work (Ubuntu 22.04, Docker, DigitalOcean)
```yaml
CACHE_ENABLED: 'true'
CACHE_PERMISSIONS: 'true'
CACHE_SCHEMA: 'true'
CACHE_AUTO_PURGE: 'true'
```
Great! Can you check whether the issue reoccurs once you set `CACHE_AUTO_PURGE` to `false`? This would then confirm our mutual observations.
@stx-chris After applying the changes to the docker-compose.yml, how do you restart your project? A simple `docker compose restart` does not clear my cache, it seems... So I wonder if my previous result is really valid.
I am deploying to Google Cloud which rebuilds the container every time. For this test I would manually delete the docker image and rebuild.
@stx-chris I deleted all containers and images. Upon "docker compose up -d" my previous test is valid.
Doing the whole process again and then setting `CACHE_AUTO_PURGE` to `false` causes the problem of the interface not reflecting the changes again. So yes, it seems to be a `CACHE_AUTO_PURGE` problem...
@br41nslug I guess the issue has been pinpointed then. Let us know if you need more details. Thanks!
@rijkvanzanten Is it possible to re-open this issue, please?
The lack of cache clearing on restart is a consequence of https://github.com/directus/directus/pull/18238; this was done deliberately for horizontally scaled setups. After reading back, enabling `CACHE_AUTO_PURGE` seems to be the solution. Perhaps `CACHE_AUTO_PURGE` should be enabled by default with that change 🤔
Agreed, but I'm afraid there is still a bug hiding in how `CACHE_AUTO_PURGE` is treated. Whatever its effect on restart is (whether single instances or horizontally scaled pods are affected), it should not affect the update mechanics once the instance is up and running.
There is no good reason why anybody should wish not to see their recent changes in the UI.
I am also wondering why this issue shows only in Docker environments and not locally. Any ideas?
it should not affect the update mechanics once the instance is up and running.
I disagree, as that is the purpose of this flag, as the name suggests: when disabled, the cache does not get cleared automatically.
There is no good reason why anybody should wish not to see their recent changes in the UI.
The main argument for introducing this, I believe, was platform performance in production instances where the schema no longer changes.
I am also wondering why this issue shows only in Docker environments and not locally. Any ideas?
This to me sounds more like something is wrong with the cache locally 😬
Likely the reason this behaviour suddenly changed on your end is that you were running without the schema cache before. After https://github.com/directus/directus/pull/17763 the schema cache seems to be enabled by default, so the suggested solution of enabling `CACHE_AUTO_PURGE` by default would return the "expected" behaviour for anyone who was not using the schema cache before.
I used what the documentation suggested. So if the "running without schema cache before" default changed, this should be clearly stated, IMO (actually, I would even say that this justifies a new version, since it is a critical change). "A problem with cache locally": would that also be true for a DigitalOcean installation? (Because I get the error there.)
"A problem with cache locally": would that also be true for a DigitalOcean installation? (Because I get the error there.)
This was a reaction to @stx-chris, who explicitly stated "it is working as expected locally"; I don't think that is the case for your DO setup.
I used what the documentation suggested. So if the "running without schema cache before" habit changed this should be clearly stated IMO (actually I even would say that this justifies a new version since it is a critical change).
This is indeed probably an issue of default settings on our end. I'll re-open the ticket to correct these defaults.
Ah, ok! – I understood @stx-chris in the way that all his Docker installations have problems, but the local one (I assumed a Node installation) does NOT have a problem. Now I am confused, because the latter works as expected – but you say it might be a problem with the local cache...
Anyway: thank you for re-opening the issue and having a closer look at it! Very much appreciated! :-)
@rijkvanzanten and @br41nslug Thank you! 💯 And thank you, @stx-chris, for finding the culprit! 👍
@stx-chris @rowild The patch for `CACHE_AUTO_PURGE` was reverted because of a performance concern: enabling it will purge the cache on any collection change, including activity/revisions (like logging in and browsing around the Data Studio).
We have found the underlying culprit, which did turn out to be `PUBLIC_URL`, as @rowild mentioned but I incorrectly dismissed 😬 This behaviour was changed in https://github.com/directus/directus/pull/17642, resulting in this issue when the `PUBLIC_URL` is not identical to the URL used to reach the app. If these do not match, the app will receive cached results instead of bypassing the cache.
The workaround for now is to either set the `PUBLIC_URL` to the correct URL or, if that's not possible, remove the `PUBLIC_URL` until we can set up a more robust fix for this issue. We'll leave this ticket open in the meantime.
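(To make the workaround concrete: with the reproduction file above, which maps port `8055:8055`, a matching `PUBLIC_URL` would use plain `http` and exactly the host/port typed into the browser – my illustration, not an official recommendation:)

```yaml
environment:
  PUBLIC_URL: 'http://localhost:8055' # must match the URL used to open the app
```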
Describe the Bug
After running `docker compose pull [service]` and restarting the app (`docker compose down` and `docker compose up -d`, all done on DigitalOcean with SQLite), the fields of a collection are not updated anymore.

https://user-images.githubusercontent.com/213803/230712051-ae5dfe3d-4ec8-44b0-8e2a-c324249ae49b.mp4
To Reproduce
(The content has not been updated.)
Hosting Strategy
Self-Hosted (Docker Image)