3DGenomes / TADbit

TADbit is a complete Python library to deal with all steps to analyze, model and explore 3C-based data. With TADbit the user can map FASTQ files to obtain raw interaction binned matrices (Hi-C like matrices), normalize and correct interaction matrices, identify and compare the so-called Topologically Associating Domains (TADs), build 3D models from the interaction matrices, and finally, extract structural properties from the models. TADbit is complemented by TADkit for visualizing 3D models.
GNU General Public License v3.0

tadbit map ValueError #344

Closed eng3001 closed 3 years ago

eng3001 commented 3 years ago

I installed TADbit with conda following the directions on your website. Whenever I try to run tadbit map, I run into the following error:

ValueError: file has no sequences defined (mode='r') - is it SAM/BAM format? Consider opening with check_sq=False

Commands used:

tadbit map -w MAP_DIR --fastq s_obliquus_S3HiC_R1_clean.fastq --index assembly.fasta --read 1 --cpus 6 --renz Sau3AI
tadbit map -w MAP_DIR --fastq s_obliquus_S3HiC_R2_clean.fastq --index assembly.fasta --read 2 --cpus 6 --renz Sau3AI
tadbit map -w DUAL_MAP_DIR --fastq s_obliquus_S3HiC_R2_clean.fastq --fastq2 s_obliquus_S3HiC_R2_clean.fastq --genome GENOME --index assembly.fasta --read 0 --cpus 6 --renz Sau3AI

TADbit output:

Writing versions of TADbit and dependencies
Generating Hi-C QC plot

Any information on how to get around this error would be extremely helpful. Thank you!

david-castillo commented 3 years ago

Hi,

This command is correct:

tadbit map -w MAP_DIR --fastq s_obliquus_S3HiC_R1_clean.fastq --index assembly.fasta --read 1 --cpus 6 --renz Sau3AI

However, --index has to point to the index built for your mapper (gem, bowtie2 or hisat2), not to the FASTA file. For gem, you build the index with the gem-indexer command-line tool; every mapper has its own indexer.

Regards

David

eng3001 commented 3 years ago

Thank you for the help! Best, Wyatt

eng3001 commented 3 years ago

I have tried building an index for the genome with both hisat2 and bowtie2, using the following commands:

hisat2-build --threads 24 assembly.fasta S_obliquus
bowtie2-build assembly.fasta Scenedesmus_obliquus

I then tried to pass the index builder's output path to tadbit map but continue to get the same error. tadbit map commands:

/usr/bin/time -v tadbit map -w MAP_DIR --fastq s_obliquus_S3HiC_R1_clean.fastq --index HiSat2_Index --read 1 --cpus 6 --renz Sau3AI
/usr/bin/time -v tadbit map -w MAP_DIR --fastq s_obliquus_S3HiC_R2_clean.fastq --index HiSat2_Index --read 2 --cpus 6 --renz Sau3AI

tadbit map --index seems to be searching for an actual .index file in those directories. I will try to use gem-indexer.

david-castillo commented 3 years ago

I guess HiSat2_Index is the folder. You have to pass the index this way:

--index HiSat2_Index/S_obliquus

eng3001 commented 3 years ago

Command: tadbit map -w MAP_DIR --fastq s_obliquus_S3HiC_R1_clean.fastq --index HiSat2_Index/S_obliquus --read 1 --cpus 6 --renz Sau3AI

Error:

Traceback (most recent call last):
  File "/home/wyatte/.conda/envs/tadbit/bin/tadbit", line 152, in <module>
    main(sys.argv)
  File "/home/wyatte/.conda/envs/tadbit/bin/tadbit", line 149, in main
    args.func(args)
  File "/home/wyatte/.conda/envs/tadbit/lib/python3.7/site-packages/pytadbit/tools/tadbit_map.py", line 47, in run
    check_options(opts)
  File "/home/wyatte/.conda/envs/tadbit/lib/python3.7/site-packages/pytadbit/tools/tadbit_map.py", line 228, in check_options
    raise IOError('ERROR: index file not found at ' + opts.index)
OSError: ERROR: index file not found at HiSat2_Index/S_obliquus

I still get the error above when trying to pass the index to tadbit map.

fransua commented 3 years ago

Hi, hisat2 does not want a folder but a truncated file name, that is, the prefix shared by the index files. For example, on my computer I have these files after building the index:

Homo_sapiens-GRCh38.p13.ht2.1.ht2
Homo_sapiens-GRCh38.p13.ht2.2.ht2
Homo_sapiens-GRCh38.p13.ht2.3.ht2
Homo_sapiens-GRCh38.p13.ht2.4.ht2
Homo_sapiens-GRCh38.p13.ht2.5.ht2
Homo_sapiens-GRCh38.p13.ht2.6.ht2
Homo_sapiens-GRCh38.p13.ht2.7.ht2
Homo_sapiens-GRCh38.p13.ht2.8.ht2

and to work with this, I pass to TADbit: --index Homo_sapiens-GRCh38.p13.ht2
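The prefix convention described above can be checked before launching a run. The following is a minimal sketch and not part of TADbit: `guess_mapper` and `INDEX_PATTERNS` are hypothetical names, and the suffix patterns assume the usual file naming produced by hisat2-build (`.N.ht2`) and bowtie2-build (`.N.bt2`, `.rev.N.bt2`).

```python
import glob
import os

# Assumed suffix patterns per mapper. They follow the usual output naming of
# hisat2-build and bowtie2-build; nothing here is read from TADbit itself.
INDEX_PATTERNS = {
    "hisat2": ".*.ht2*",
    "bowtie2": ".*.bt2*",
}


def guess_mapper(prefix):
    """Return (mapper, files) for the first mapper whose index files share `prefix`.

    Returns (None, []) when no index files start with the given prefix, which is
    what happens when a folder or a full file name is passed instead.
    """
    for mapper, pattern in INDEX_PATTERNS.items():
        # glob.escape protects special characters that may appear in the path.
        files = sorted(glob.glob(glob.escape(prefix) + pattern))
        if files:
            return mapper, files
    return None, []


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever the index was built.
    mapper, files = guess_mapper(os.path.join("HiSat2_Index", "S_obliquus"))
    print(mapper, len(files))
```

Under these assumptions, calling `guess_mapper("Homo_sapiens-GRCh38.p13.ht2")` in the folder holding the eight files listed above would report hisat2, while passing the name of the containing folder would match nothing.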

eng3001 commented 3 years ago

Hello, Thank you for the help. I have hisat2 files named:

S_obliquus.1.ht2
S_obliquus.2.ht2
S_obliquus.3.ht2
S_obliquus.4.ht2
S_obliquus.5.ht2
S_obliquus.6.ht2
S_obliquus.7.ht2
S_obliquus.8.ht2

I tried --index HiSat2_Index/S_obliquus.ht2 and am still running into the following error:

OSError: ERROR: index file not found at HiSat2_Index/S_obliquus.ht2

fransua commented 3 years ago

OK, then I understand it should be --index HiSat2_Index/S_obliquus, as you tried before. Can you try the same command, adding the mapper option:

--mapper hisat2 --index HiSat2_Index/S_obliquus

eng3001 commented 3 years ago

Thank you all for the help! --mapper hisat2 --index HiSat2_Index/S_obliquus ran successfully.
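For reference, the invocation that worked in this thread can be assembled programmatically. This is a sketch only: `build_map_command` is a hypothetical helper, every flag it uses is taken from the commands posted above, and the file names are the ones from this issue. It prints the commands rather than running them.

```python
import shlex


def build_map_command(workdir, fastq, index_prefix, read, renz,
                      cpus=6, mapper="hisat2"):
    """Assemble the argument list for one `tadbit map` run (one read end).

    `index_prefix` is the shared prefix of the index files, not a folder
    and not a full file name.
    """
    return [
        "tadbit", "map",
        "-w", workdir,
        "--fastq", fastq,
        "--mapper", mapper,        # state the mapper explicitly
        "--index", index_prefix,
        "--read", str(read),
        "--cpus", str(cpus),
        "--renz", renz,
    ]


# One command per read end, mirroring the commands posted in the thread.
commands = [
    build_map_command(
        workdir="MAP_DIR",
        fastq="s_obliquus_S3HiC_R%d_clean.fastq" % read,
        index_prefix="HiSat2_Index/S_obliquus",
        read=read,
        renz="Sau3AI",
    )
    for read in (1, 2)
]

for cmd in commands:
    # Quote each argument so the line can be pasted into a shell.
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the arguments as a list makes it straightforward to hand them to `subprocess.run` later without any shell quoting concerns.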