Open kcondon opened 8 years ago
This sort of thing belongs in the Installation Guide of course, but in my pull request at #2895 I didn't address it at all because I've really only heard about how these various temp directories are used. @landreev is the authority on all this, I believe.
Odum has ~7,300 files piled up in $files.dir/temp and, while I think I remember it being safe to delete them, it would be great to document this for 'fraidy-cats like me.
@Venki18 asked about temp directories at https://groups.google.com/d/msg/dataverse-community/tLi8_Wmx9Ao/byOzzMc8BQAJ . I agree we should document them. I don't personally know where they are.
"java.io.tmpdir" seems to be one of the important settings. It's mentioned in https://github.com/IQSS/dataverse/issues/6656#issuecomment-587056202
See https://github.com/IQSS/dataverse/issues/6656#issuecomment-1057585379 for a comment about a configuration hack involving glassfish-web.xml for the PrimeFaces temp directory.
A new summary by @qqmyers at https://groups.google.com/g/dataverse-community/c/yR4JVx_Zs8o/m/xR9d_DsVAAAJ
"Hopefully accurate: Ingest is done asynchronously, so files uploaded without direct upload start as a temp file and are then transferred to S3. At some point after that, a copy gets pulled back to the server, ingest is done, and ingested version of the file is uploaded to S3. Due to a quirk in naming in the store, the original file is S3 copied to a new key and the ingested version uses the original key."
To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.
If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.
This is still needed. It was asked about just today: https://dataverse.zulipchat.com/#narrow/stream/375707-community/topic/Temp.20Files/near/464640068
And there's this recent related issue:
Re-opening.
Another question about temp files: https://groups.google.com/g/dataverse-community/c/I3UQXSJFdxU/m/-U0_yBcHCAAJ
We should write some docs.
We currently use several temp directories for different purposes. This is not usually an issue for end users, but it can become a maintenance burden for system admins.
Known temp directories:
- /tmp: Used by ingest and Two Ravens. Two Ravens does not clean up after itself; ingest (mostly) does. The suggestion is to use a /tmp cleaner cron job, though the code may be able to clean up after itself further.
- JSP, e.g. /usr/local/glassfish4/glassfish/domains/domain1/generated/jsp/dataverse-4.2.3, where dataverse-4.2.3 is the name of the deployed application: Used by the PrimeFaces upload control until the dataset is saved. Files can be orphaned here if a file upload fails, the session is abandoned, etc. The suggestion is to link this to a larger storage area and assign a cleaner cron job.
Others?
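
For anyone looking for a starting point on the "cleaner cron job" idea, here is a minimal sketch in Python. It is not an official Dataverse tool, and the names (`find_stale_files`, `clean`), the age threshold, and the choice of modification time as the staleness test are all assumptions made for illustration. Nothing in this issue confirms which files are safe to delete, so the sketch only reports by default and deletes only when asked to.

```python
import os
import stat
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return regular files under root last modified more than max_age_days ago.

    Hypothetical helper: "stale" here just means an old mtime, which is an
    assumption and not a rule taken from Dataverse itself.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # lstat: judge a symlink by itself rather than its target
                info = path.lstat()
            except FileNotFoundError:
                continue  # file disappeared while we were scanning
            if stat.S_ISREG(info.st_mode) and info.st_mtime < cutoff:
                stale.append(path)
    return sorted(stale)


def clean(root, max_age_days, delete=False):
    """List stale files under root; remove them only when delete=True."""
    stale = find_stale_files(root, max_age_days)
    if delete:
        for path in stale:
            path.unlink(missing_ok=True)  # requires Python 3.8+
    return stale
```

Calling `clean("/tmp", 7)` would only return the list of candidates, which is a reasonable first step for reviewing what has piled up. Passing `delete=True` removes them. A generous age threshold is probably wise, since an upload or ingest in progress may still be using a recent file.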