htwangtw / adie_ongoingthoughts

ADIE ongoing thought related analysis plan
MIT License

CISC ID to behavioural data ID conversion #6

Closed htwangtw closed 3 years ago

htwangtw commented 3 years ago

We need a converter, and the dictionary used for the conversion should be encrypted.

htwangtw commented 3 years ago

The conversion reference has been provided by James, who previously worked with the dataset. The conversion table is now stored on the remote computing server.

willstrawson commented 3 years ago

I'll work on this

willstrawson commented 3 years ago

Hey, here's what I have so far. If you get a chance, could you advise on line 57 (shutil.move())? I'm having trouble renaming the file and the directory at the same time. No worries if you don't have time, I'll crack it soon, I'm sure.

# Script to convert CISC ID to ADIE ID in directory and file names 
# This is only tested on directories/files with a BIDS naming convention
# The way the script searches for the sub- ID is dependent on the characters before and after,
# which is specific to BIDS - be aware  

import pandas as pd
import os 
import re
import shutil
import sys

# Construct path of converter .txt file such that it's relative to the user running the script
# Get parent directory of this script       
pathname = os.path.dirname(os.path.abspath(sys.argv[0]))
# Get the level above that (i.e. the critchley_adie project dir)      
project_path = os.path.split(pathname)[0]
print('project_path=',project_path)
# Path to the ADIE/CISC conversion txt file
txt=(project_path+"/BIDS_data/sourcedata/adie_idconvert.txt")

# Convert this to a dictionary, where key = CISC and val = ADIE
rename = {}
with open(txt) as f:
    for line in f:
        (key,val)=line.split()
        rename[str(key)] = str(val)

# USE OS.WALK

def subconvert(p):
    # p should be the path of the directory above the sub- dirs
    for root,dirs,files in os.walk(p):
        for f in files:
            # Extract sub- number for searching 
            # Extract number after 'sub-' and before '_'
            try:
                srch = re.search('sub-(.+?)_',f)
                # extract just ID number
                cidn = srch.group(1)
                # Add CISC to ID to enable search 
                cid = 'CISC'+str(cidn)
                # If this file contains the CISC ID...

                if cid in rename:
                    #print (cid,rename[cid])
                    try:
                        # ... replace the CISC ID with ADIE ID 
                        # add root to filename and then replace 
                        fullf = os.path.join(root,f)
                        newf = fullf.replace(cidn,rename[cid])
                        print("Renaming",fullf,"to",newf)

                        # Rename file and directory 
                        # !! Getting a "no such file found" error
                        # I'm trying to create a new file name and directory name that don't exist yet...
                        # TODO: figure out a way of renaming both levels at the same time OR create the directory first
                        shutil.move(os.path.join(root,f), os.path.join(root,newf))

                    # If an error occurred with .replace or .move, print the error
                    except Exception as e:
                        print(e)

            # If an error occurred with re.search, print the error
            except Exception as e:
                print(e)

            print('\n')
        print('-'*100)   

#test path
path = "/Volumes/cisc2/projects/critchley_adie/wills_data/bids/bids_data2/sub-23014/"

subconvert(path)
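
One possible way around the "no such file" error at shutil.move(), offered as a sketch rather than a definitive fix: in the script above, newf is built from the full path, so when the ID also appears in a directory name the destination points into a directory that does not exist yet. Walking the tree bottom-up with os.walk(topdown=False) and renaming one level at a time avoids this, because each entry is renamed while its parent still has its original name. The function name convert_ids, the regular expression, and the ID format in the usage note are assumptions for illustration, not part of the original script.

```python
import os
import re


def convert_ids(parent_dir, rename):
    """Rename BIDS files and directories under parent_dir, bottom-up.

    rename maps 'CISC<number>' to the replacement ID (assumed format,
    mirroring the dictionary built in the script above).
    Returns a list of (old_path, new_path) pairs that were renamed.
    """
    # Capture everything after 'sub-' up to the next '_' or path separator,
    # so the pattern matches file names and directory names alike.
    pattern = re.compile(r"sub-([^_/\\]+)")
    renamed = []
    # topdown=False visits the deepest directories first, so every entry is
    # renamed while its parent directory still has its original name.
    for root, dirs, files in os.walk(parent_dir, topdown=False):
        for name in files + dirs:
            match = pattern.search(name)
            if match is None:
                continue
            old_id = match.group(1)
            new_id = rename.get("CISC" + old_id)
            if new_id is None:
                continue  # ID not in the conversion table; leave untouched
            new_name = name.replace("sub-" + old_id, "sub-" + new_id)
            old_path = os.path.join(root, name)
            new_path = os.path.join(root, new_name)
            os.rename(old_path, new_path)
            renamed.append((old_path, new_path))
    return renamed
```

Called as convert_ids(parent, rename), where parent is the directory above the sub- dirs, this renames files first and the sub- directories last. Note that the directory passed in is itself never renamed, only its contents.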

htwangtw commented 3 years ago

May I suggest:

  1. Accept the pull request I send on your fork
  2. Create a new branch named convert_id from the updated main
  3. Add this script to bin/convert_id.py
  4. Create a draft PR to this repository

We can ignore your other PR #3.

willstrawson commented 3 years ago

Yes, good idea, thanks.