Eugene-Mark / bigdata-file-viewer

A cross-platform (Windows, MAC, Linux) desktop application to view common bigdata binary format like Parquet, ORC, AVRO, etc. Support local file system, HDFS, AWS S3, Azure Blob Storage ,etc.
GNU General Public License v2.0
282 stars 54 forks source link

Needs handling of windows 8.3 dos short paths #43

Closed sadaaithal closed 3 months ago

sadaaithal commented 3 months ago

When handling valid AVRO files which resides deep inside a nested filesystem tree, Windows OS returns 8.3 DOS short names for a file that is chosen through the file picker. So a file becomes C:\Work\Tools\cdgc\SADA_O~1\SADA_O~1\INGEST~1\8B6EC9~1\content\data\RELATI~1\COMINF~1.DAT\BACF6F~1.AVR

but its actual filename is something like c:\Work\Tools\cdgc\<long path here>\bacf6f.something.avro

This causes the default data parser to be chosen (parquet) which then leads to java.lang.RuntimeException: file:/C:/Work/Tools/cdgc/SADA_O~1/SADA_O~1/INGEST~1/8B6EC9~1/content/data/RELATI~1/COMINF~1.DAT/BACF6F~1.AVR is not a Parquet file. Expected magic number at tail, but found [35, 46, 107, 40]

Ive managed to patch this with a powershell hack as follows:

`

diff --git a/Renderer.java "b/C:\\Work\\Workspaces\\Default\\testcases-jdk11\\src\\org\\eugene\\controller\\Renderer.java"
index 9d840fe..8f1e861 100644
--- a/Renderer.java
+++ "b/C:\\Work\\Workspaces\\Default\\testcases-jdk11\\src\\org\\eugene\\controller\\Renderer.java"
@@ -2,6 +2,8 @@ package org.eugene.controller;

 import javafx.stage.FileChooser;
 import javafx.stage.Stage;
+
+import org.apache.commons.lang3.SystemUtils;
 import org.apache.hadoop.fs.Path;
 import org.eugene.core.common.AWSS3Reader;
 import org.eugene.model.CommonData;
@@ -13,12 +15,16 @@ import org.eugene.ui.Dashboard;
 import org.eugene.ui.Main;
 import org.eugene.ui.Table;

+import java.io.BufferedReader;
 import java.io.File;
+import java.io.InputStreamReader;
+import java.nio.file.Paths;
 import java.util.ArrayList;
 import java.util.List;
 import java.util.Map;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;
+import java.util.stream.Collectors;

 public class Renderer {

@@ -85,11 +91,43 @@ public class Renderer {
             System.out.println("The location is empty");
         }
         File selectedFile = filechooser.showOpenDialog(stage);
-        String absolutePath = selectedFile.getAbsolutePath();
+        String absolutePath = resolveWindowsShortPath(selectedFile.getAbsolutePath());
         PhysicalDB.getInstance().updateLocation(absolutePath);
         Path path = new Path(absolutePath);
         return load(path);
     }
+
+    String resolveWindowsShortPath(String input) {
+               if (SystemUtils.IS_OS_WINDOWS) {
+                       if (Paths.get(input).toAbsolutePath().toString().contains("~")) {
+                               java.nio.file.Path inputPath = Paths.get(input);
+                               String psDosToLongPathCmdFmt = "powershell \"(Get-Item -LiteralPath '" + inputPath.toAbsolutePath().toString() +"').FullName\"";
+                               System.out.println("sada-fix: dos 8.3 input path: " + input);
+                               System.out.println("sada-fix: running conversion cmd: " + psDosToLongPathCmdFmt);
+                               try {
+                                       String output = "";
+                                       int exitVal = 0;
+                                       try {
+                                               Process proc = Runtime.getRuntime().exec(psDosToLongPathCmdFmt);
+                                               output = new BufferedReader(new InputStreamReader(proc.getInputStream()))
+                                                               .lines().collect(Collectors.joining(System.lineSeparator()));
+                                               proc.waitFor();
+                                               exitVal = proc.exitValue();
+                                       } catch(Exception e) {
+                                               e.printStackTrace();
+                                               System.out.println("sada-fix: powershell cmd failed. Exit val " + exitVal);
+                                               System.out.println("sada-fix: revert to default path");
+                                               output = input;
+                                       }
+                                       System.out.println("sada-fix: output: " + output);
+                                       return output;
+                               } catch (Exception e) {
+                                       return input;
+                               }
+                       }
+               }
+               return input;
+       }

     private String getDirectory(String fullPath){
         String regex = "(.*)[\\\\][.]*";

`

Eugene-Mark commented 3 months ago

Thanks for reporting the issue and contribute the patch, let me have a quick review about it.

sadaaithal commented 3 months ago

fix committed