CINPLA / expipe-plugin-cinpla

Plugins for expipe command line interface
http://expipe-plugin-cinpla.readthedocs.io/en/latest/
GNU General Public License v3.0

Windows/posix action data path #29

Closed espenhgn closed 4 months ago

espenhgn commented 5 years ago

Hi. It turns out that datasets registered/preprocessed on Windows have hardcoded Windows paths, so the action cannot be reprocessed or worked on under macOS/Linux without workarounds. I haven't looked into why:

ipdb> action
<expipe.core.Action object at 0x11d335fd0>
ipdb> action.data['main']
'actions\\1012-090119-01\\data\\main.exdir'

alejoe91 commented 5 years ago

It should be possible to add a check with pathlib to see whether it's a WindowsPath. If it is, instantiate a WindowsPath object and convert it into a PosixPath one.
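A minimal sketch of such a check, assuming the stored value is a plain string like the one in the ipdb output above. `to_posix_str` is a hypothetical helper, not part of expipe. It uses `PureWindowsPath` rather than `WindowsPath`, because the pure class can be instantiated on any OS and accepts both `\` and `/` as separators:

```python
import pathlib


def to_posix_str(stored):
    """Return `stored` with forward slashes, whichever OS wrote it.

    Hypothetical helper for illustration only.
    """
    # PureWindowsPath treats both '\\' and '/' as separators and never
    # touches the filesystem, so it is safe to use on macOS/Linux.
    return pathlib.PureWindowsPath(stored).as_posix()


windows_style = 'actions\\1012-090119-01\\data\\main.exdir'
posix_style = 'actions/1012-090119-01/data/main.exdir'

print(to_posix_str(windows_style))
print(to_posix_str(posix_style))
```

Both calls print the same forward-slash string. One caveat of this approach: a POSIX file name that legitimately contains a backslash would be split, so it only suits data where that cannot occur.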

Where does the error come from?

espenhgn commented 5 years ago

Hi. I'm on branch https://github.com/espenhgn/expipe-plugin-cinpla/tree/dev_cobra_merge, but I don't think it matters. Full traceback:

from expipe_plugin_cinpla.imports import PAR
import os
action_id = '1012-090119-01'
project_id = PAR.PROJECT_ID
project = PAR.PROJECT
action = project.actions[action_id]
data_path = action.data['main']
print(data_path) # 'actions\\1012-090119-01\\data\\main.exdir'
os.path.isdir(os.path.join('..', 'actions','1012-090119-01','data','main.exdir')) # True

espenhgn commented 5 years ago

Btw, an error may arise in several places, depending on how data_path is used. Path information must be written in a platform-independent format.
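To illustrate why the failure can surface in different places, here is a small sketch using only the standard library. It applies POSIX and Windows parsing rules to the stored string from this issue, so it behaves the same on any OS:

```python
import ntpath
import posixpath
from pathlib import PurePosixPath, PureWindowsPath

stored = 'actions\\1012-090119-01\\data\\main.exdir'

# Under POSIX rules the backslash is an ordinary character, so the whole
# string is treated as a single path component.
posix_parts = PurePosixPath(stored).parts
posix_name = posixpath.basename(stored)

# Under Windows rules the same string splits into four components.
windows_parts = PureWindowsPath(stored).parts
windows_name = ntpath.basename(stored)

print(len(posix_parts), posix_name)
print(len(windows_parts), windows_name)
```

On macOS/Linux, anything that joins, splits, or tests this string for existence therefore sees one oddly named file rather than a nested directory.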

alejoe91 commented 5 years ago

How would you write a path in a platform-independent format? In my opinion, it would probably be easier to have a get_path() function that deals with it.

espenhgn commented 5 years ago

> How would you write a path in a platform-independent format?

As a tuple of strings in the attributes.yaml file (provided YAML allows it): ('actions', <action ID>, 'data', 'main.exdir'). That should be system independent. Then it should be straightforward to use os.path.join or the pathlib equivalent to get a valid path on whichever OS is working on the data.
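A sketch of this idea, assuming the components are stored as a list of strings (YAML has sequences rather than a tuple type, so a list is what would round-trip). `stored_parts` stands in for the value read from attributes.yaml, and `project_path` is a hypothetical project root:

```python
import os
import pathlib

# Stand-in for the value read back from attributes.yaml (assumption).
stored_parts = ['actions', '1012-090119-01', 'data', 'main.exdir']
project_path = pathlib.Path('my_project')  # hypothetical project root

# Rebuild a path that is valid on the OS running this code.
data_path = project_path.joinpath(*stored_parts)
same_path = os.path.join(str(project_path), *stored_parts)

# Going the other way when registering: split a path into its components.
parts_to_store = list(
    pathlib.PurePath('actions', '1012-090119-01', 'data', 'main.exdir').parts
)

print(data_path)
print(parts_to_store)
```

No separator is ever written to disk this way, so the stored value is identical regardless of which OS registered the data.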

> In my opinion, it would probably be easier to have a get_path() function that deals with it.

While I agree it's easy to work around, I disagree that we should hard-code path information in a system-specific way. After all, we want to be able to distribute our data between different computers for processing, sharing, or whatever. As it is now, the YAML output is inconsistent, depending on the OS used to register the data, and this shouldn't happen.

espenhgn commented 5 years ago

If we have to choose, we should use the POSIX path type. Windows can deal with those (source: https://medium.com/@ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f).
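A sketch of what writing the POSIX form could look like at registration time, assuming the relative path is built with pathlib before it is stored. The `data` dict below is only a stand-in for `action.data`; how expipe actually writes this attribute is not shown in this thread:

```python
import pathlib

action_id = '1012-090119-01'
relative = pathlib.PurePath('actions', action_id, 'data', 'main.exdir')

# What str() would give on Windows, shown via PureWindowsPath so the
# example behaves the same on any OS.
windows_str = str(pathlib.PureWindowsPath('actions', action_id, 'data', 'main.exdir'))

# as_posix() always uses '/', independent of the OS running the code.
data = {}  # stand-in for action.data (assumption)
data['main'] = relative.as_posix()

print(windows_str)
print(data['main'])
```

Storing the `as_posix()` string instead of `str(path)` would keep the YAML output the same on every platform.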

espenhgn commented 5 years ago

Possible workaround (should still work with POSIX-style paths):

diff --git a/expipe_plugin_cinpla/scripts/utils.py b/expipe_plugin_cinpla/scripts/utils.py
index 8483aba..6f07203 100644
--- a/expipe_plugin_cinpla/scripts/utils.py
+++ b/expipe_plugin_cinpla/scripts/utils.py
@@ -145,7 +145,7 @@ def _make_data_path(action, overwrite):
 def _get_data_path(action):
     action_path = action._backend.path
     project_path = action_path.parent.parent
-    data_path = action.data['main']
+    data_path = str(pathlib.Path(pathlib.PureWindowsPath(action.data['main'])))
     return project_path / data_path

alejoe91 commented 5 years ago

Added the suggested fix in the latest commit, ec4f1f11472ed8b8d2e2a979ff83f189b8a4ef81.

espenhgn commented 5 years ago

Hi, sorry for not responding (too busy at the winter school). Did you include a fix downstream of that commit to write PosixPath type when registering datasets?