nservant / HiC-Pro

HiC-Pro: An optimized and flexible pipeline for Hi-C data processing

find: File system loop detected #464

Closed think-o closed 3 years ago

think-o commented 3 years ago

I get this error when I run HiC-Pro as follows:

/usr/local/bin/HiC-Pro_3.0.0/bin/HiC-Pro -i analysis_hi_c_data/results/data/ -o results/data/hicPro_out -c analysis_hi_c_data/genome_files/config-hicpro.txt -p

find: File system loop detected; ‘rawdata/hicPro_out/rawdata’ is part of the same file system loop as ‘rawdata’.

The directory structure follows the suggested layout:

data/sample1/trim1_R1.fastq
data/sample1/trim2_R2.fastq

Before this, I got an error asking me to update my PATH for ice:

Error: The 'ice' command is not in your path. Please check where the 'iced' python package has been installed and update your PATH !

I updated the PATH with the version provided in HiC-Pro_3.0.0/scripts by adding this line to the ~/.bashrc file:

export PATH="/usr/local/bin/HiC-Pro_3.0.0/scripts/ice/:$PATH"

This is the config-install.txt file I used for installation

#########################################################################
## Paths and Settings  - Start editing here !
#########################################################################

PREFIX = 
BOWTIE2_PATH =
SAMTOOLS_PATH = /home/nipgr/software/samtools-1.13
R_PATH =
PYTHON_PATH = /home/nipgr/software/Python-3.9.6
CLUSTER_SYS =

Although both the 'ice' and 'iced' Python modules were present in the Python-3.9.6 library, I still had to update the PATH. I do not know whether these two problems are related. However, I get the following error when I run the version of ice from the scripts directory:

[bioinfo@localhost HiC-Pro_3.0.0]$ ./scripts/ice --version
Traceback (most recent call last):
  File "./ice", line 6, in <module>
    from scipy import sparse
ImportError: No module named scipy
I have installed all the packages.

All the packages were installed with the pip3.9 that comes with Python 3.9.6. The same Python version was given in the configuration file for the tool installation.
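
One thing that may be worth checking here: the traceback reads `ImportError: No module named scipy`, which is the wording Python 2 uses (Python 3.6 and later report `ModuleNotFoundError: No module named 'scipy'`). That suggests, though it does not prove, that the script was run by a different interpreter than the Python 3.9.6 into which pip3.9 installed the packages. The sketch below is a rough way to test which interpreters can import a given module. The helper name `can_import` and the candidate list are hypothetical and should be adapted to the system.

```python
import subprocess
import sys


def can_import(interpreter, module):
    """Return True if `interpreter -c "import module"` exits successfully."""
    try:
        result = subprocess.run(
            [interpreter, "-c", "import " + module],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The interpreter was not found or could not be executed.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical candidate list; replace with the interpreters on your system,
    # for example the one under PYTHON_PATH in config-install.txt.
    candidates = [sys.executable, "python", "python3"]
    for interpreter in candidates:
        for module in ("scipy", "iced"):
            status = "ok" if can_import(interpreter, module) else "MISSING"
            print(f"{interpreter}: {module} {status}")
```

If `python` reports the modules as missing while `python3` finds them, the interpreter that ends up running the script is the likely place to look.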

This is the "config-hicpro.txt" I have used

# Please change the variable settings below if necessary

#########################################################################
## Paths and Settings  - Do not edit !
#########################################################################

TMP_DIR = tmp
LOGS_DIR = logs
BOWTIE2_OUTPUT_DIR = bowtie_results
MAPC_OUTPUT = hic_results
RAW_DIR = rawdata

#######################################################################
## SYSTEM AND SCHEDULER - Start Editing Here !!
#######################################################################
N_CPU = 55
SORT_RAM = 200G
LOGFILE = hicpro.log

JOB_NAME = 
JOB_MEM = 
JOB_WALLTIME = 
JOB_QUEUE = 
JOB_MAIL = 

#########################################################################
## Data
#########################################################################

PAIR1_EXT = _R1
PAIR2_EXT = _R2

#######################################################################
## Alignment options
#######################################################################

MIN_MAPQ = 10

BOWTIE2_IDX_PATH = analysis_hi_c_data/genome_files/index_files_HiC_Pro
BOWTIE2_GLOBAL_OPTIONS = --very-sensitive -L 30 --score-min L,-0.6,-0.2 --end-to-end --reorder
BOWTIE2_LOCAL_OPTIONS =  --very-sensitive -L 20 --score-min L,-0.6,-0.2 --end-to-end --reorder

#######################################################################
## Annotation files
#######################################################################

REFERENCE_GENOME = analysis_hi_c_data/genome_files/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa
GENOME_SIZE = analysis_hi_c_data/genome_files/arab_chrom.sizes

#######################################################################
## Allele specific analysis
#######################################################################

ALLELE_SPECIFIC_SNP = 

#######################################################################
## Capture Hi-C analysis
#######################################################################

CAPTURE_TARGET =
REPORT_CAPTURE_REPORTER = 1

#######################################################################
## Digestion Hi-C
#######################################################################

GENOME_FRAGMENT = analysis_hi_c_data/genome_files/arab_hindiii.bed
LIGATION_SITE = AAGCTAGCTT
MIN_FRAG_SIZE = 
MAX_FRAG_SIZE =
MIN_INSERT_SIZE =
MAX_INSERT_SIZE =

#######################################################################
## Hi-C processing
#######################################################################

MIN_CIS_DIST =
GET_ALL_INTERACTION_CLASSES = 1
GET_PROCESS_SAM = 0
RM_SINGLETON = 1
RM_MULTI = 1
RM_DUP = 1

#######################################################################
## Contact Maps
#######################################################################

BIN_SIZE = 20000 40000 150000 500000 1000000
MATRIX_FORMAT = upper

#######################################################################
## Normalization
#######################################################################
MAX_ITER = 100
FILTER_LOW_COUNT_PERC = 0.02
FILTER_HIGH_COUNT_PERC = 0
EPS = 0.1
think-o commented 3 years ago

The error "find: File system loop detected; ‘rawdata/hicPro_out/rawdata’ is part of the same file system loop as ‘rawdata’" occurred because the output folder results/data/hicPro_out was a subdirectory of the input folder results/data. I have moved the output folder to results. It seems HiC-Pro does not work if the output folder is inside the raw data folder.
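
A quick check along these lines can catch the problem before a run starts. This is only a minimal sketch using the paths mentioned in this thread; the helper name `is_inside` is hypothetical and not part of HiC-Pro.

```python
from pathlib import Path


def is_inside(child, parent):
    """Return True if `child` is `parent` itself or lies anywhere below it."""
    child_path = Path(child).resolve()
    parent_path = Path(parent).resolve()
    return child_path == parent_path or parent_path in child_path.parents


if __name__ == "__main__":
    input_dir = "results/data"
    # Output folder nested in the input folder: the layout that caused the loop.
    print(is_inside("results/data/hicPro_out", input_dir))
    # Output folder next to the input folder: the layout used after the fix.
    print(is_inside("results/hicPro_out", input_dir))
```

The first call should report True and the second False, so a True result for the chosen -o path is a sign that the output folder needs to be moved.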

I am running it on TORQUE. I started the run with qsub -M HiCProstep1.sh

Can I make the run verbose? How can I follow the progress of this process? Mine has been running for 16 hours, and no log files or intermediate files have been generated. I doubt that it is running properly.

nservant commented 3 years ago

Hi,

When you are running HiC-Pro on your cluster, you should get the logs in the TORQUE output file!

Hope it helps.

N
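
Besides reading the TORQUE output file, one rough way to see whether a job is writing anything is to list the most recently modified files under the output folder. The sketch below assumes the output folder is results/hicPro_out, which is a placeholder for whatever was passed to -o. The helper name `recent_files` is hypothetical.

```python
import os
from pathlib import Path


def recent_files(root, limit=5):
    """Return up to `limit` files under `root`, most recently modified first."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    # Hypothetical output directory; use the path passed to HiC-Pro with -o.
    output_dir = "results/hicPro_out"
    if os.path.isdir(output_dir):
        for path in recent_files(output_dir):
            print(path)
    else:
        print(f"{output_dir} does not exist yet")
```

If nothing in the list changes over a long period, the job may be stalled or may not have started, and the scheduler's own output file is the next thing to inspect.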