MiraldiLab / maxATAC

Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks
Apache License 2.0

Change Path for #130

Closed. Elfaba closed this issue 1 week ago

Elfaba commented 4 months ago

Hi,

I was wondering which constants.py I need to edit, and what I need to change in it, to move the reference data to a different location.

I am working on an HPC cluster and don't use $HOME because it has minimal storage and computing space, so writing there fails with the error: Disk quota exceeded

Thank you.

find . -type f -name "constants.py"

./lib/python3.9/asyncio/constants.py
./lib/python3.9/site-packages/numpy/doc/constants.py
./lib/python3.9/site-packages/scipy/constants/constants.py
./lib/python3.9/site-packages/tensorboard/_vendor/html5lib/constants.py
./lib/python3.9/site-packages/keras/saving/saved_model/constants.py
./lib/python3.9/site-packages/maxatac/utilities/constants.py
./lib/python3.9/site-packages/tensorflow/python/saved_model/constants.py
./lib/python3.9/site-packages/tensorflow/python/keras/saving/saved_model/constants.py
./lib/python3.9/site-packages/tensorflow/python/trackable/constants.py
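A single package's module file can also be looked up from Python instead of searching the whole environment with find. This is a minimal sketch using the standard library's importlib; the helper name module_file is hypothetical, and asyncio.constants is used in the demo only because it ships with Python. In an environment where maxATAC is installed, passing "maxatac.utilities.constants" should resolve to the maxatac entry in the listing above.

```python
import importlib.util
from typing import Optional


def module_file(module_name: str) -> Optional[str]:
    """Return the file path of an importable module, or None if it is not found.

    Hypothetical helper for illustration; not part of maxATAC.
    """
    try:
        # find_spec imports parent packages but not the target module itself.
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "maxatac") is not installed.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Stdlib module used as a stand-in; swap in "maxatac.utilities.constants"
    # inside the environment where maxATAC is installed.
    print(module_file("asyncio.constants"))
```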

The easiest option is to use the command maxatac data to download the data to the required directory. The maxatac data function will download the maxATAC_data repo and reference data into your ~/opt/ directory under ~/opt/maxatac. Only the hg38 reference genome has been extensively tested.

Using custom reference data

The directory ~/opt/maxatac/data is the default location where maxATAC will look for the maxATAC models, hg38 reference annotations, etc.

If you want to use your own references (e.g., hg19) or models, set the appropriate flags for each file with the path to your custom files. You can also adjust the relative paths in constants.py to be the default values for all functions.

ANRudrapatna commented 4 months ago

Hi! Thanks for reaching out, and apologies for the delay in responding!

To download the reference data to a directory other than the default directory, you can use two steps:

1) Set "base_dir" in line 16 of the data.py script (within the analyses folder) to the path of your desired directory and then run maxatac data to download the data into that directory.
2) Modify line 23 of the constants.py script by setting "maxatac_data_path" to os.path.join(dir, "maxatac", "data"), where "dir" is the path to the directory where you installed the reference data. This provides the location of scripts required to run maxATAC prepare.
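As a rough illustration of what the value in step 2 looks like, the sketch below builds the default path mentioned in the documentation (~/opt/maxatac/data) and a custom one using the same os.path.join call. The function names and the /scratch/my_user directory are hypothetical, and this is not a copy of maxATAC's constants.py; it only shows the path that the edited "maxatac_data_path" should end up pointing at.

```python
import os


def default_data_path() -> str:
    """Default location described in the docs: ~/opt/maxatac/data."""
    return os.path.join(os.path.expanduser("~"), "opt", "maxatac", "data")


def custom_data_path(base_dir: str) -> str:
    """Path to assign to maxatac_data_path for a custom base directory.

    Mirrors os.path.join(dir, "maxatac", "data") from the reply above;
    base_dir is whatever directory the reference data was downloaded into.
    """
    return os.path.join(base_dir, "maxatac", "data")


if __name__ == "__main__":
    # "/scratch/my_user" is a made-up example of an HPC scratch directory.
    target = custom_data_path("/scratch/my_user")
    print("default:", default_data_path())
    print("custom: ", target)
    # Quick sanity check before running maxATAC against the new location.
    print("exists: ", os.path.isdir(target))
```

Checking that the directory exists after running maxatac data is a cheap way to confirm that both edits point at the same place.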

Hope this helps! Please let us know if you continue to have issues or have any other questions!

Akshata