antonisdim / haystac

Code repository for the HAYSTAC pipeline
MIT License

Cache error: unable to run analyse rule #4

Closed Pkaps25 closed 3 years ago

Pkaps25 commented 3 years ago

Hello,

I am able to build databases and create samples, but the haystac analyse command is failing with the following error:

haystac: error: You are trying to set a value for parameter cache on top of an already existing one (old: /local/workdir/pk445/haystac/haystac_cache_example, new: /home/pk445/haystac/cache). Please either revert to the original parameter you used or create a new output directory.

I have tried the following:

  1. haystac config --clear-cache
  2. Deleting the new and old cache directories
  3. Setting a new cache directory with haystac config
  4. Changing the output directory of the analyse rule

However, the error still persists and I can't seem to diagnose the source. Thank you for your help.

Best, Peter

antonisdim commented 3 years ago

Hello Peter,

I hope you are doing great and thank you for your comments!

There is a validation process in haystac that checks that the same cache directory is used throughout the analysis, from building a database to analysing a sample. As I understand it, you built the database with one cache directory path and then ran haystac analyse with a different one (please let me know if I have misunderstood).

Regarding the things you have tried:

  1. haystac config --clear-cache just deletes all the contents of the cache directory, but it doesn’t change any of the provided settings for haystac config --cache <dir path>.
  2. Deleting the old and the new cache directories also does not affect the settings for haystac config --cache <dir path>.
  3. Setting a new cache directory path with haystac config --cache <dir path> will work, but you need to use the same cache path throughout your analysis (from haystac database to haystac analyse).
  4. If your output directories for haystac database, haystac sample or haystac analyse are the same as the ones used in a previous round of analyses, that could be why the error persists. Each output directory contains a yaml file that haystac uses for validation. Before running an analysis, haystac analyse checks that the user-specified options (including the cache path) are consistent across haystac database, haystac sample and itself, so old cache path values stored in the yaml files of old output directories might be clashing with the new path you have specified.
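
As a rough illustration of the kind of check described in point 4, here is a minimal Python sketch. It is not haystac's actual code: the function name, the exception class and the dictionary layout are hypothetical, and only the error wording is modelled on the message quoted in the original report.

```python
class ConfigMismatchError(ValueError):
    """Hypothetical error for a parameter that clashes with a stored value."""


def validate_against_stored(stored, requested):
    """Compare requested parameters with those already recorded for an output directory.

    `stored` stands in for the values parsed from the yaml file of an existing
    output directory; `requested` stands in for the values of the current run.
    Both are plain dicts here (an assumption made for the sake of the sketch).
    """
    for name, new_value in requested.items():
        old_value = stored.get(name)
        # A parameter that was recorded earlier must keep the same value.
        if old_value is not None and old_value != new_value:
            raise ConfigMismatchError(
                f"You are trying to set a value for parameter {name} on top of "
                f"an already existing one (old: {old_value}, new: {new_value})."
            )
    # No clash: the merged settings are what would be recorded for this run.
    return {**stored, **requested}


# Paths taken from the error message in the original report.
stored = {"cache": "/local/workdir/pk445/haystac/haystac_cache_example"}
requested = {"cache": "/home/pk445/haystac/cache"}

try:
    validate_against_stored(stored, requested)
except ConfigMismatchError as err:
    print(f"error: {err}")
```

Under this model, an output directory written with the old cache path keeps raising the error until either the old path is restored or a fresh output directory, with no recorded values, is used.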

After setting your new cache path, please try specifying new output directory paths to haystac database, haystac sample and haystac analyse, so that the same cache dir path is used by all the modules of haystac. If the problem still persists please let me know.
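
One way to see which existing output directories still record the old path is to scan their yaml files. The following is a minimal sketch, assuming the yaml files sit directly inside each output directory and store the path on a line of the form `cache: <path>`. The file layout, the key name and the function name are all assumptions, and the directory names in the usage lines are placeholders.

```python
from pathlib import Path


def find_stale_cache_entries(output_dirs, expected_cache):
    """Return (yaml_path, recorded_value) pairs whose cache entry differs.

    Uses simple line-based parsing instead of a yaml library so the sketch
    has no third-party dependencies; it only looks for `cache: <value>` lines.
    """
    stale = []
    for out_dir in output_dirs:
        out_dir = Path(out_dir)
        if not out_dir.is_dir():
            continue  # skip directories that do not exist
        for yaml_path in sorted(out_dir.glob("*.yaml")):
            for line in yaml_path.read_text().splitlines():
                key, sep, value = line.partition(":")
                if sep and key.strip() == "cache":
                    recorded = value.strip().strip("'\"")
                    if recorded != expected_cache:
                        stale.append((str(yaml_path), recorded))
    return stale


# Placeholder directory names; replace them with your own output directories.
for path, recorded in find_stale_cache_entries(
    ["database_output", "sample_output", "analysis_output"],
    "/home/pk445/haystac/cache",
):
    print(f"{path} records cache path {recorded}")
```

Any directory reported by such a scan was written under a different cache path, so it should not be reused as an output directory for the new run.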

Please do let me know if I have missed something or if anything is unclear. I will also update the documentation to clarify how haystac validates user specified parameters.

Thank you for your comments and please let me know if this works or if you need more information!

Best, Antony

Pkaps25 commented 3 years ago

Hello Antony,

Thank you for your comment. This explanation is very clear and resolved the issue.

Peter