OHDSI / WhiteRabbit

WhiteRabbit is a small application that can be used to analyse the structure and contents of a database as preparation for designing an ETL. It comes with RabbitInAHat, an application for interactive design of an ETL to the OMOP Common Data Model with the help of the scan report generated by WhiteRabbit.
http://ohdsi.github.io/WhiteRabbit
Apache License 2.0

Permission denied error writing scan results #293

Open howff opened 3 years ago

howff commented 3 years ago

After running for 9 hours, WhiteRabbit crashed with a "Permission denied" error after printing "Generating scan report".

It also doesn't say which file it was trying to write when it failed.

./whiteRabbit -ini ini
10:40:06 Started new scan
10:40:06 Scanning table one
Stopped after 1000 rows
etc.
(Some tables take 3 hours despite only loading 1000 rows !!!)
21:14:10 Generating scan report
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Permission denied
    at org.apache.poi.xssf.streaming.SXSSFWorkbook.createAndRegisterSXSSFSheet(SXSSFWorkbook.java:662)
    at org.apache.poi.xssf.streaming.SXSSFWorkbook.createSheet(SXSSFWorkbook.java:679)
    at org.ohdsi.whiteRabbit.scan.SourceDataScan.createFieldOverviewSheet(SourceDataScan.java:201)
    at org.ohdsi.whiteRabbit.scan.SourceDataScan.generateReport(SourceDataScan.java:182)
    at org.ohdsi.whiteRabbit.scan.SourceDataScan.process(SourceDataScan.java:117)
    at org.ohdsi.whiteRabbit.WhiteRabbitMain.launchCommandLine(WhiteRabbitMain.java:268)
    at org.ohdsi.whiteRabbit.WhiteRabbitMain.<init>(WhiteRabbitMain.java:126)
    at org.ohdsi.whiteRabbit.WhiteRabbitMain.main(WhiteRabbitMain.java:121)
Caused by: java.io.IOException: Permission denied
    at java.io.UnixFileSystem.createFileExclusively(Native Method)
    at java.io.File.createTempFile(File.java:2024)
    at org.apache.poi.util.DefaultTempFileCreationStrategy.createTempFile(DefaultTempFileCreationStrategy.java:110)
    at org.apache.poi.util.TempFile.createTempFile(TempFile.java:66)
    at org.apache.poi.xssf.streaming.SheetDataWriter.createTempFile(SheetDataWriter.java:87)
    at org.apache.poi.xssf.streaming.SheetDataWriter.<init>(SheetDataWriter.java:70)
    at org.apache.poi.xssf.streaming.SheetDataWriter.<init>(SheetDataWriter.java:75)
    at org.apache.poi.xssf.streaming.SXSSFWorkbook.createSheetDataWriter(SXSSFWorkbook.java:330)
    at org.apache.poi.xssf.streaming.SXSSFSheet.<init>(SXSSFSheet.java:80)
    at org.apache.poi.xssf.streaming.SXSSFWorkbook.createAndRegisterSXSSFSheet(SXSSFWorkbook.java:658)
    ... 7 more

My ini file is like this:

WORKING_FOLDER = /home/myusername/WhiteRabbit_v0.10.3
DATA_TYPE = PostgreSQL
SERVER_LOCATION = 127.0.0.1/mydbname
USER_NAME = myusername
PASSWORD = mypassword
DATABASE_NAME = mydbname
DELIMITER = ,
TABLES_TO_SCAN = *
SCAN_FIELD_VALUES = yes
MIN_CELL_COUNT = 5
MAX_DISTINCT_VALUES = 1000
ROWS_PER_TABLE = 1000
CALCULATE_NUMERIC_STATS = yes
NUMERIC_STATS_SAMPLER_SIZE = 500

MaximMoinat commented 3 years ago

Hi @howff, apparently WR is not allowed to write to the given working folder. Could you try running WR as a super user or changing the permissions on the folder?

To speed up the scan, you can disable SCAN_FIELD_VALUES and/or CALCULATE_NUMERIC_STATS by setting them to 'no'.

howff commented 3 years ago

There's nothing wrong with the permissions on the WORKING_FOLDER.

I had to use strace to find the problem - WhiteRabbit tries to create /tmp/poifiles/poi-sxssf-sheetblah.xml but the /tmp/poifiles directory has insufficient permissions when multiple users run it.
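For anyone wanting to check this without strace, here is a minimal sketch that reports who owns the shared POI temp directory and whether the current user can write to it. It assumes POI's default strategy places its files in a `poifiles` directory under `java.io.tmpdir`, which matches the `/tmp/poifiles` path seen above. The class name `PoiTempDirProbe` and its `describe` method are hypothetical and not part of WhiteRabbit or POI.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; not part of WhiteRabbit or Apache POI.
final class PoiTempDirProbe {

    /** Summarises owner and writability of dir for the current user. */
    static String describe(Path dir) {
        if (!Files.exists(dir)) {
            return dir + ": does not exist yet";
        }
        String owner;
        try {
            owner = Files.getOwner(dir).getName();
        } catch (IOException | UnsupportedOperationException e) {
            owner = "unknown";
        }
        String me = System.getProperty("user.name");
        boolean writable = Files.isDirectory(dir) && Files.isWritable(dir);
        return dir + ": owner=" + owner + ", current user=" + me + ", writable=" + writable;
    }

    public static void main(String[] args) {
        // Assumption: POI's default temp location is <java.io.tmpdir>/poifiles.
        Path poiDir = Paths.get(System.getProperty("java.io.tmpdir"), "poifiles");
        System.out.println(describe(poiDir));
    }
}
```

If the reported owner is a different user and `writable` is false, that would be consistent with the "Permission denied" in the stack trace.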

MaximMoinat commented 3 years ago

Thanks for looking into the issue further. I was not aware that WR tries to write to a tmp folder. For now, I don't have a solution, but will take a look whether this behaviour can be changed in a future release.

blootsvoets commented 3 years ago

@MaximMoinat you could use this solution to force Apache POI to use a specific tmpdir: https://stackoverflow.com/a/35453124/574082

File dir = new File("somepath");
dir.mkdir();
org.apache.poi.util.TempFile.setTempFileCreationStrategy(new DefaultTempFileCreationStrategy(dir));
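
One possible way to pick the directory in the snippet above so that users do not collide is to create a fresh temp directory per run. This is only a sketch using the standard library: the class name `PrivateTempDir` and the `whiterabbit-poi-` prefix are made up for illustration, and the POI call is left as a comment so the example compiles without POI on the classpath.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of WhiteRabbit or Apache POI.
final class PrivateTempDir {

    /** Creates a uniquely named temp directory owned by the current user. */
    static File create(String prefix) {
        try {
            Path dir = Files.createTempDirectory(prefix);
            File asFile = dir.toFile();
            asFile.deleteOnExit(); // only removed at JVM exit if it is empty by then
            return asFile;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File dir = create("whiterabbit-poi-");
        System.out.println("Private temp dir: " + dir);
        // With Apache POI available, the directory would be handed over as suggested above:
        // org.apache.poi.util.TempFile.setTempFileCreationStrategy(
        //         new org.apache.poi.util.DefaultTempFileCreationStrategy(dir));
    }
}
```

Because each run gets its own directory, a `poifiles` directory left behind by another user would no longer matter. Whether WhiteRabbit should do this or use a subfolder of WORKING_FOLDER instead is a design choice for the maintainers.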

MaximMoinat commented 3 years ago

Thanks, I will check it out and create a fix if needed.

tom-dyar commented 2 years ago

I do think it would be great if, as soon as ANY file is known to be needed, WhiteRabbit checked that the path is valid and alerted the user before a significant amount of time elapses. I just ran into this with a misconfigured output path in my .ini file!
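
A fail-fast check along these lines could be sketched as follows: before starting a long scan, try to create and delete a small probe file in each directory that will be written to. This is an illustration only; `OutputPathCheck` and `problemWith` are hypothetical names, and nothing here reflects how WhiteRabbit is actually structured.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check; not part of WhiteRabbit.
final class OutputPathCheck {

    /** Returns null if a file can be created in dir, otherwise a readable reason. */
    static String problemWith(Path dir) {
        if (!Files.isDirectory(dir)) {
            return "Not an existing directory: " + dir;
        }
        try {
            // Actually creating a file is a more reliable test than inspecting permission bits.
            Path probe = Files.createTempFile(dir, "write-check-", ".tmp");
            Files.delete(probe);
            return null;
        } catch (IOException | SecurityException e) {
            return "Cannot write to " + dir + ": " + e;
        }
    }

    public static void main(String[] args) {
        // The directory to check is taken from the first argument, defaulting to the current directory.
        Path workingFolder = Paths.get(args.length > 0 ? args[0] : ".");
        String problem = problemWith(workingFolder);
        if (problem != null) {
            System.err.println(problem);
            System.exit(1);
        }
        System.out.println("OK: " + workingFolder + " is writable");
    }
}
```

Running the same check against both the working folder and the temp directory at startup would surface this kind of error in seconds rather than after hours of scanning.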

howff commented 1 year ago

Hi, this is still a blocker preventing multiple users from using WhiteRabbit!

janblom commented 1 year ago

I have uploaded a new version for testing: v0.10.9-test. Please see the explanation there.

It would be highly appreciated if someone could test this and report success or the details of any failure.