IQSS / dataverse

Open source research data repository software
http://dataverse.org

Documentation: Document various temp directories used by Dataverse #2848

Open kcondon opened 8 years ago

kcondon commented 8 years ago

Currently we use several temp directories for different purposes. While this is not usually an issue for end users, it can become a maintenance issue for system admins.

Known temp directories:

/tmp: Used by ingest and Two Ravens. Two Ravens does not clean up after itself; ingest (mostly) does. A /tmp cleaner cron job is suggested, but we may be able to clean up after ourselves more thoroughly.

JSP, e.g. /usr/local/glassfish4/glassfish/domains/domain1/generated/jsp/dataverse-4.2.3, where dataverse-4.2.3 is the name of the deployed application: Used by the PrimeFaces upload control until the dataset is saved. Files can be orphaned here if file uploads fail, the session is abandoned, etc. Suggested to link this to a larger storage area and assign a cleaner cron job.
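As a rough illustration of the "cleaner cron job" idea, here is a minimal Python sketch, not anything that ships with Dataverse. The function names `find_stale_files` and `remove_stale_files` and the age threshold are assumptions made for this example. It selects regular files by modification time and defaults to a dry run, since whether a given file is actually safe to delete is exactly what this issue asks to have documented.

```python
import os
import stat
import time
from pathlib import Path


def find_stale_files(root, max_age_days, now=None):
    """Return regular files under root last modified more than max_age_days ago."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                # lstat: judge a symlink by its own metadata, not its target's
                st = path.lstat()
            except FileNotFoundError:
                continue  # file disappeared while we were walking the tree
            if stat.S_ISREG(st.st_mode) and st.st_mtime < cutoff:
                stale.append(path)
    return sorted(stale)


def remove_stale_files(root, max_age_days, dry_run=True, now=None):
    """Return the stale files; delete them only when dry_run is False."""
    stale = find_stale_files(root, max_age_days, now=now)
    if not dry_run:
        for path in stale:
            path.unlink(missing_ok=True)  # requires Python 3.8+
    return stale
```

A cautious way to use this would be to call `remove_stale_files(some_dir, 7)` first, review the returned list, and only then rerun with `dry_run=False`. The directory and the 7-day threshold are placeholders to be chosen per installation.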

Others?

pdurbin commented 8 years ago

This sort of thing belongs in the Installation Guide of course, but in my pull request at #2895 I didn't address it at all because I've really only heard about how these various temp directories are used. @landreev is the authority on all this, I believe.

donsizemore commented 7 years ago

Odum has ~7,300 files piled up in $files.dir/temp and, while I think I remember it being safe to delete them, it would be great to document this for 'fraidy-cats like me.

pdurbin commented 6 years ago

@Venki18 asked about temp directories at https://groups.google.com/d/msg/dataverse-community/tLi8_Wmx9Ao/byOzzMc8BQAJ . I agree we should document them. I don't personally know where they are.

pdurbin commented 4 years ago

"java.io.tmpdir" seems to be one of the important settings. It's mentioned in https://github.com/IQSS/dataverse/issues/6656#issuecomment-587056202

pdurbin commented 2 years ago

See https://github.com/IQSS/dataverse/issues/6656#issuecomment-1057585379 for a comment about a configuration hack involving glassfish-web.xml for the PrimeFaces temp directory.

pdurbin commented 10 months ago

A new summary by @qqmyers at https://groups.google.com/g/dataverse-community/c/yR4JVx_Zs8o/m/xR9d_DsVAAAJ

"Hopefully accurate: Ingest is done asynchronously, so files uploaded without direct upload start as a temp file and are then transferred to S3. At some point after that, a copy gets pulled back to the server, ingest is done, and ingested version of the file is uploaded to S3. Due to a quirk in naming in the store, the original file is S3 copied to a new key and the ingested version uses the original key."
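To make the key naming described in that summary concrete, here is a toy sketch that uses a plain Python dict in place of an S3 bucket. It is not Dataverse code: `ingest_in_store`, `ORIGINAL_SUFFIX`, and the suffix value are hypothetical names picked only for illustration, and the real naming scheme is not stated in the summary.

```python
ORIGINAL_SUFFIX = ".orig"  # hypothetical suffix, chosen only for this illustration


def ingest_in_store(store, key, convert):
    """Mimic the described sequence on a dict that stands in for an S3 bucket."""
    original = store[key]                    # a copy is pulled back to the server
    store[key + ORIGINAL_SUFFIX] = original  # the original is copied to a new key
    store[key] = convert(original)           # the ingested version reuses the original key
    return store
```

After the call, the store holds two objects for the one upload: the ingested version under the original key and the untouched original under the new key.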

cmbz commented 1 month ago

To focus on the most important features and bugs, we are closing issues created before 2020 (version 5.0) that are not new feature requests with the label 'Type: Feature'.

If you created this issue and you feel the team should revisit this decision, please reopen the issue and leave a comment.

pdurbin commented 1 month ago

This is still needed. It was asked about just today: https://dataverse.zulipchat.com/#narrow/stream/375707-community/topic/Temp.20Files/near/464640068

And there's this recent related issue:

Re-opening.