onnela-lab / mano

Mano - Beiwe research platform API
BSD 3-Clause "New" or "Revised" License

backfill FileNotFoundError .backfill #12

Closed martakarass closed 2 years ago

martakarass commented 2 years ago

I get a FileNotFoundError from the backfill function:

FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/sk/c14blb7s493_7lgzdq0m3sjm0000gn/T/tmpt163bi70/<user ID removed for security>/.backfill'

I do not think I am supposed to have the .../.backfill file created before running the backfill function, hence I think it might be a bug.

Below, I provide an example that is reproducible except (a) the Keyring setup is my local-machine specific; (b) I use a Beiwe test study (https://staging.beiwe.org/view_study/68) whose study ID has been determined as appropriate to share in GitHub issue code; as with all Beiwe studies, access authorization is needed to download these data.

import mano
import sys  
import os
import mano.sync as msync
import tempfile
from datetime import date

# set up the keyring
sys.path.insert(0, '/Users/martakaras/Documents/data_beiwe_settings')
import keyring_studies_MK_staging
Keyring = mano.keyring(None)

# define study ID and user ID
study_id = 'QrDrgyGFyH6CmTEOCCnZVw1o'
user_id = list(mano.users(Keyring, study_id))[5]

Show that I can download, say, gps data using the "traditional" mano approach (here, 51 gps files are downloaded):

# set up temporary dir to download the data to
temp_dir_1 = tempfile.TemporaryDirectory()
output_folder = temp_dir_1.name

# download and extract gps data with msync.download
zf = msync.download(Keyring, study_id, user_id, data_streams = ['gps'])
zf.extractall(output_folder)

# see there is data 
print(len(os.listdir(os.path.join(output_folder, user_id, "gps"))))
51 
# remove the temp directory
temp_dir_1.cleanup()

Show the error when attempting to download the same data using the backfill function:

# set up temporary dir to download the data to
temp_dir_1 = tempfile.TemporaryDirectory()
output_folder = temp_dir_1.name

# download and extract gps data with msync.download
start_date = '2021-04-06T00:00:00'
msync.backfill(Keyring, study_id, user_id, output_folder, start_date = start_date,  data_streams = ['gps'])
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
/var/folders/sk/c14blb7s493_7lgzdq0m3sjm0000gn/T/ipykernel_2349/260581604.py in <module>
      5 # download and extract gps data with msync.download
      6 start_date = '2021-04-06T00:00:00'
----> 7 msync.backfill(Keyring, study_id, user_id, output_folder, start_date = start_date,  data_streams = ['gps'])

~/opt/anaconda3/envs/forest_gh/lib/python3.8/site-packages/mano/sync/__init__.py in backfill(Keyring, study_id, user_id, output_dir, start_date, data_streams, lock, passphrase)
     42         backfill_file = os.path.join(output_dir, user_id, '.backfill')
     43         logger.info('reading backfill file %s', backfill_file)
---> 44         with open(backfill_file, 'a+') as fo:
     45             fo.seek(0)
     46             timestamp = fo.read().strip()

FileNotFoundError: [Errno 2] No such file or directory: '/var/folders/sk/c14blb7s493_7lgzdq0m3sjm0000gn/T/tmpt163bi70/<user ID removed for security>/.backfill'
# remove the temp directory
temp_dir_1.cleanup()

biblicabeebli commented 2 years ago

@tokeefe This is Eli. I'm now directly at Onnela Lab, so let me know if you want to get in touch over this issue; we should probably discuss the level of support that mano gets. Happy to do some maintenance, I just never got into the codebase in the past.

tokeefe commented 2 years ago

mano wasn't ensuring that the user_id subdirectory within output_folder exists before opening the .backfill file inside it. Fixed in version 0.5.2:

pip install --upgrade mano==0.5.2
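
For anyone who cannot upgrade right away, the failure can be reproduced and worked around without Beiwe access. The sketch below is not the actual mano patch. It only mirrors the open(backfill_file, 'a+') call visible in the traceback and shows that 'a+' creates a missing file but not a missing parent directory. The helper names ensure_user_dir and read_backfill_timestamp and the 'example_user' ID are hypothetical, introduced only for illustration.

```python
import os
import tempfile


def ensure_user_dir(output_dir, user_id):
    """Hypothetical helper: create output_dir/user_id if missing, return its path."""
    user_dir = os.path.join(output_dir, user_id)
    os.makedirs(user_dir, exist_ok=True)
    return user_dir


def read_backfill_timestamp(output_dir, user_id):
    """Hypothetical helper: open the .backfill marker as in the traceback and read it."""
    backfill_file = os.path.join(output_dir, user_id, '.backfill')
    # 'a+' creates the file if needed, but raises FileNotFoundError
    # when the parent directory does not exist.
    with open(backfill_file, 'a+') as fo:
        fo.seek(0)
        return fo.read().strip()


with tempfile.TemporaryDirectory() as output_folder:
    user_id = 'example_user'  # placeholder, not a real Beiwe user ID

    # Without the user_id subdirectory, the open call fails as in the report.
    try:
        read_backfill_timestamp(output_folder, user_id)
        failed_without_dir = False
    except FileNotFoundError:
        failed_without_dir = True

    # Creating the subdirectory first lets the same call succeed.
    ensure_user_dir(output_folder, user_id)
    timestamp = read_backfill_timestamp(output_folder, user_id)
    marker_exists = os.path.exists(
        os.path.join(output_folder, user_id, '.backfill'))
```

With an older mano release, the equivalent workaround would presumably be to call os.makedirs(os.path.join(output_folder, user_id), exist_ok=True) before msync.backfill, though upgrading is the simpler route.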