kundajelab / atac_dnase_pipelines

ATAC-seq and DNase-seq processing pipeline
BSD 3-Clause "New" or "Revised" License

Install genome faidx issue #63

Closed: rbronste closed this issue 6 years ago

rbronste commented 6 years ago

Hi,

In the install genome code I had to change faidx to samtools faidx for it to run. However, I now get the following error about the faidx options:

Code (after changing faidx to samtools faidx):

## extract fasta per chromosome
cd ${DATA_DIR}/$GENOME
mkdir -p seq
cd seq
rm -f ${REF_FA_PREFIX}
ln -s ../${REF_FA_PREFIX} ${REF_FA_PREFIX}
samtools faidx -x ${REF_FA_PREFIX}
cp --remove-destination *.fai ../

Installer output:

2017-08-03 11:52:54 (167 MB/s) - “mm10_dnase_avg_fseq_signal_metadata.txt” saved [1251/1251]

Extracting/processing data files...
faidx: invalid option -- 'x'

Usage:   samtools faidx <file.fa|file.fa.gz> [<reg> [...]]

rbronste commented 6 years ago

It seems as though -x is not a valid samtools faidx flag.

leepc12 commented 6 years ago

The faidx in the installer is not samtools faidx but pyfaidx. Can you post the installer error you got before you modified the code? Please also run the following to debug:

$ source activate bds_atac
$ which faidx
$ source deactivate
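
The same check can be wrapped in a small helper that reports whether the faidx on PATH comes from the conda environment. This is only a sketch: faidx_origin is a hypothetical name, not part of the pipeline, and it assumes the usual conda layout in which the environment lives under a directory such as .../envs/bds_atac/.

```shell
# Hypothetical helper (not part of the pipeline): report where `faidx` resolves.
# Assumption: the conda env lives in a directory named after it (".../envs/<name>/").
faidx_origin() {
    env_name=${1:-bds_atac}
    # command -v prints the path of the first `faidx` found on PATH
    path=$(command -v faidx) || { echo "not-found"; return 1; }
    case "$path" in
        */envs/"$env_name"/*) echo "conda-env" ;;   # resolved inside the env
        *) echo "other" ;;                          # some other faidx on PATH
    esac
}

# Run this after `source activate bds_atac`.
faidx_origin bds_atac || true
```

If this prints "other" or "not-found" while bds_atac is active, the installer would not be picking up the faidx it expects.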

Thanks,

Jin

rbronste commented 6 years ago

Hey Jin,

OK, now I understand my mistake: I should have activated the bds_atac environment before this step, not just before running the pipeline. I changed the code to samtools faidx and removed -x, and it worked; now I know why. The installer error is the one I posted above, but I'm assuming it occurred because I did not run the following:

source activate bds_atac

Also, the debugging commands did show that faidx resolves to the Python version (pyfaidx) after activating the conda environment. Thanks.
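
One caveat about the workaround: samtools faidx called without a region only writes the .fai index, whereas the -x option of pyfaidx's faidx (--split-files, as far as I know) also writes each sequence to its own file, which is what the "extract fasta per chromosome" step is for. A minimal sketch of that split using plain awk instead of pyfaidx follows. split_fasta is a hypothetical helper, and the <name>.fa output naming is an assumption, not necessarily what the pipeline expects.

```shell
# Hypothetical helper (not part of the installer): split a multi-FASTA file
# into one file per sequence, named <sequence name>.fa (naming is an assumption).
split_fasta() {
    # $1: input FASTA, $2: output directory
    mkdir -p "$2"
    awk -v dir="$2" '
        /^>/ {
            if (out) close(out)        # finish the previous sequence file
            name = substr($1, 2)       # header text up to the first whitespace
            out = dir "/" name ".fa"
        }
        { print > out }                # header and sequence lines go to the current file
    ' "$1"
}

# Example call (the path is illustrative only):
#   split_fasta ../genome.fa .
```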

leepc12 commented 6 years ago

I wonder why the installer failed before you modified the script, because source activate bds_atac is already in install_genome_data.sh (https://github.com/kundajelab/atac_dnase_pipelines/blob/master/install_genome_data.sh#L164).

You should not activate any pipeline-related conda environment, such as bds_atac, before running pipelines; if you do, you will probably see a Java error. The pipeline automatically and selectively loads the conda environments (bds_atac and bds_atac_py3) for each subtask, so please make sure to deactivate bds_atac and bds_atac_py3 before running pipelines.
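
A quick guard for this can be sketched in shell. It relies on the CONDA_DEFAULT_ENV variable that conda sets on activation, and assumes the environments were activated by name, so the variable holds bds_atac or bds_atac_py3 rather than a path. pipeline_env_active is a hypothetical helper, not something the pipeline provides.

```shell
# Hypothetical guard (not part of the pipeline): succeed if a pipeline-related
# conda env is currently active. Assumes envs were activated by name, so that
# CONDA_DEFAULT_ENV holds the env name rather than a path.
pipeline_env_active() {
    case "${CONDA_DEFAULT_ENV:-}" in
        bds_atac|bds_atac_py3) return 0 ;;
        *) return 1 ;;
    esac
}

if pipeline_env_active; then
    echo "Deactivate '$CONDA_DEFAULT_ENV' (source deactivate) before running the pipeline." >&2
fi
```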