ChangLab / FAST-iCLIP

Fully Automated and Standardized (FAST) iCLIP analysis pipeline.
GNU General Public License v2.0

Can't run example data - error "re-reun with the -g option for a genome file" #33

Open natsampaio opened 6 years ago

natsampaio commented 6 years ago

Hello,

I am trying to use the FASTiCLIP package to analyse my data, but I can't get it to work with the example data. The run prints a series of errors and the script aborts after around 15 min. I set up the environment slightly differently from the instructions, as follows:

$ conda create -n FASTiCLIP python=2
$ conda install matplotlib=1.5 pandas=0.18.1 samtools=0.1.18 bedtools=2.22.0 fastx_toolkit=0.0.14 matplotlib-venn=0.11.4 iclipro=0.1.1

Note: Bowtie2 version 2.1.0 needs to be loaded via a module:
$ module load bowtie2/2.1.0

$ pip install git+https://github.com/ChangLab/FAST-iCLIP.git

$ export FASTICLIP_PATH=/t1-data/user/nsampaio/FASTiCLIP

I had to download the configure file to a new directory in my server space:
$ mkdir FASTiCLIP
$ cd FASTiCLIP
$ wget https://raw.githubusercontent.com/ChangLab/FAST-iCLIP/master/configure

I also had to manually download the files in /bin, using wget as above, to a new FASTiCLIP/bin directory.
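Because the files were fetched by hand, it may be worth confirming that everything the pipeline looks for under FASTICLIP_PATH is actually present and non-empty. The following is only a minimal diagnostic sketch, not part of FAST-iCLIP: the function name find_missing_or_empty is hypothetical, and the list of relative paths is an assumption assembled from file names that appear in this report, not an authoritative list of what the pipeline requires.

```python
import os


def find_missing_or_empty(root, relative_paths):
    """Return {relative_path: 'missing' or 'empty'} for unusable files.

    Files that exist and contain at least one byte are not reported.
    """
    problems = {}
    for rel in relative_paths:
        full = os.path.join(root, rel)
        if not os.path.isfile(full):
            problems[rel] = "missing"
        elif os.path.getsize(full) == 0:
            problems[rel] = "empty"
    return problems


if __name__ == "__main__":
    # Assumption: FASTICLIP_PATH points at the directory created above.
    root = os.environ.get("FASTICLIP_PATH", ".")
    # Assumption: illustrative paths taken from this report only.
    expected = [
        "configure",
        "docs/GRCh38/GRCh38_repeatMasker.bed",
        "docs/GRCh38/snoRNA_coordinates_15exp.bed",
        "docs/GRCh38/miR_sort_clean.bed",
        "docs/GRCh38/genes_BED6.bed",
    ]
    for rel, status in sorted(find_missing_or_empty(root, expected).items()):
        print("%s: %s" % (status, rel))
```

Running it with FASTICLIP_PATH exported prints one line per file that is missing or has zero size, and prints nothing if all listed files look usable.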

I ran the command as follows (note: I had to change it from what is in the documentation, because the documented command doesn't match the example data now provided):

(FASTiCLIP) [nsampaio@klyn FASTiCLIP]$ fasticlip -i rawdata/A_Human_Test_R1.fastq rawdata/A_Human_Test_R2.fastq --GRCh38 -n MMhur -o results
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 1 sequences.
[samopen] SAM header is present: 625 sequences.
[samopen] SAM header is present: 625 sequences.
[samopen] no @SQ lines in the header.
[sam_read1] missing header? Abort!
[samopen] no @SQ lines in the header.
[sam_read1] missing header? Abort!
ERROR: Database file /t1-data/user/nsampaio/FASTiCLIP/docs/GRCh38/GRCh38_repeatMasker.bed contains chromosome chr1, but the query file does not. Please re-reun with the -g option for a genome file. See documentation for details.
ERROR: Database file /t1-data/user/nsampaio/FASTiCLIP/docs/GRCh38/GRCh38_repeatMasker.bed contains chromosome chr1, but the query file does not. Please re-reun with the -g option for a genome file. See documentation for details.
ERROR: Database file /t1-data/user/nsampaio/FASTiCLIP/docs/GRCh38/GRCh38_repeatMasker.bed contains chromosome chr1, but the query file does not. Please re-reun with the -g option for a genome file. See documentation for details.
ERROR: Database file /t1-data/user/nsampaio/FASTiCLIP/docs/GRCh38/GRCh38_repeatMasker.bed contains chromosome chr1, but the query file does not. Please re-reun with the -g option for a genome file. See documentation for details.
Error: Unable to open file /t1-data/user/nsampaio/FASTiCLIP/docs/GRCh38/snoRNA_coordinates_15exp.bed. Exiting.
Error: Unable to open file /t1-data/user/nsampaio/FASTiCLIP/docs/GRCh38/snoRNA_coordinates_15exp.bed. Exiting.
ERROR: Database file /t1-data/user/nsampaio/FASTiCLIP/docs/GRCh38/miR_sort_clean.bed contains chromosome chr1, but the query file does not. Please re-reun with the -g option for a genome file. See documentation for details.
ERROR: Database file /t1-data/user/nsampaio/FASTiCLIP/docs/GRCh38/genes_BED6.bed contains chromosome chr1, but the query file does not. Please re-reun with the -g option for a genome file. See documentation for details.
needLargeMem: trying to allocate 0 bytes (limit: 17179869184)
needLargeMem: trying to allocate 0 bytes (limit: 17179869184)
needLargeMem: trying to allocate 0 bytes (limit: 17179869184)
Traceback (most recent call last):
  File "/t1-data/user/nsampaio/py36-v1/conda-install/envs/FASTiCLIP/bin/fasticlip", line 11, in <module>
    load_entry_point('fasticlip==0.9.3', 'console_scripts', 'fasticlip')()
  File "/t1-data/user/nsampaio/py36-v1/conda-install/envs/FASTiCLIP/lib/python2.7/site-packages/fasticlip/fasticlip.py", line 331, in main
    geneCounts_pc = get_gene_counts(proteinCodingReads_centered)
  File "/t1-data/user/nsampaio/py36-v1/conda-install/envs/FASTiCLIP/lib/python2.7/site-packages/fasticlip/helper.py", line 570, in get_gene_counts
    bf=pd.DataFrame(pd.read_table(bedFile,header=None))
  File "/t1-data/user/nsampaio/py36-v1/conda-install/envs/FASTiCLIP/lib/python2.7/site-packages/pandas/io/parsers.py", line 562, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/t1-data/user/nsampaio/py36-v1/conda-install/envs/FASTiCLIP/lib/python2.7/site-packages/pandas/io/parsers.py", line 315, in _read
    parser = TextFileReader(filepath_or_buffer, kwds)
  File "/t1-data/user/nsampaio/py36-v1/conda-install/envs/FASTiCLIP/lib/python2.7/site-packages/pandas/io/parsers.py", line 645, in __init__
    self._make_engine(self.engine)
  File "/t1-data/user/nsampaio/py36-v1/conda-install/envs/FASTiCLIP/lib/python2.7/site-packages/pandas/io/parsers.py", line 799, in _make_engine
    self._engine = CParserWrapper(self.f, self.options)
  File "/t1-data/user/nsampaio/py36-v1/conda-install/envs/FASTiCLIP/lib/python2.7/site-packages/pandas/io/parsers.py", line 1213, in __init__
    self._reader = _parser.TextReader(src, **kwds)
  File "pandas/parser.pyx", line 523, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:5214)
pandas.io.common.EmptyDataError: No columns to parse from file

izarvillasante commented 2 years ago

I am having the same issue.
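For anyone hitting the same "contains chromosome chr1, but the query file does not" message, one thing that can be checked independently of the pipeline is whether the chromosome names in an annotation BED file match the reference names in the alignment header. The sketch below is a diagnostic aid under stated assumptions, not a fix and not part of FAST-iCLIP: all three function names are hypothetical, and it assumes the SAM header is available as text (for example, saved from `samtools view -H`). It does not establish the cause of the failure reported above.

```python
def bed_chromosomes(lines):
    """Collect chromosome names (first column) from BED-formatted lines."""
    names = set()
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and optional track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        names.add(line.split()[0])
    return names


def sam_header_chromosomes(lines):
    """Collect reference names from the SN: field of @SQ header lines."""
    names = set()
    for line in lines:
        if not line.startswith("@"):
            break  # the header ends at the first alignment record
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "@SQ":
            for field in fields[1:]:
                if field.startswith("SN:"):
                    names.add(field[3:])
    return names


def compare_names(bed_names, sam_names):
    """Report names that appear on only one side."""
    return {
        "only_in_bed": sorted(bed_names - sam_names),
        "only_in_sam": sorted(sam_names - bed_names),
    }


if __name__ == "__main__":
    # Tiny in-memory example; real input would be read from files.
    bed = ["chr1\t100\t200\tgeneA\t0\t+\n", "chr2\t5\t9\n"]
    sam = ["@HD\tVN:1.0\n", "@SQ\tSN:1\tLN:1000\n", "@SQ\tSN:chr2\tLN:500\n"]
    print(compare_names(bed_chromosomes(bed), sam_header_chromosomes(sam)))
```

An empty set from sam_header_chromosomes means the header text contained no @SQ lines at all, which is the condition the "[samopen] no @SQ lines in the header" lines in the log describe.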