wejlab / MetaScope

An R-based approach for preprocessing and aligning 16S, metagenomic, and metatranscriptomic data (PathoScope version 3.0)
GNU General Public License v3.0

Error generating target and filter index files #18

Closed VanishingRasengan closed 1 year ago

VanishingRasengan commented 2 years ago

Dear @aubreyodom,

First of all, thank you very much for developing and maintaining this amazing package! After having a lot of trouble with PathoScope, I really want to use MetaScope for metagenomic profiling, so I went through the tutorial. I ran all the code from the tutorial, but I always get an error when aligning the sample reads to the target reference genomes.

Specifically, at this step:

target_map <-
  MetaScope::align_target_bowtie(
    read1 = readPath,
    lib_dir = index_temp,
    libs = "target",
    align_dir = output_temp,
    align_file = "bowtie_target",
    overwrite = TRUE
  )

I get:

Attempting to perform Bowtie2 alignment on target index
Error in Rbowtie2::bowtie2_samtools(bt2Index = file.path(lib_dir, libs[i]),  : 
  Could not find either a valid small (.bt2) or large (.bt2l) index with basename of target at location C:/Users/Hashirama/AppData/Local/Temp/Rtmp6JIv6I/file273465b11a3d

I checked that folder, and there really are no files in it.

The problem actually starts earlier. When I create the target index, it says:

> # Create target index
> MetaScope::mk_bowtie_index(
+   ref_dir = target_ref_temp,
+   lib_dir = index_temp,
+   lib_name = "target",
+   overwrite = TRUE
+ )
Successfully built the Bowtie2 indexes
[1] "C:/Users/Hashirama/AppData/Local/Temp/Rtmp6JIv6I/file273479144471"

But there are no files in this folder either, and no filter index is generated there... Where is my mistake?

I would be really glad if you could help me!

Kind regards, VanishingRasengan

aubreyodom commented 2 years ago

Hi VanishingRasengan!

This isn't very clear from our documentation, but this package isn't quite ready for the spotlight, for this issue and several others. I can add a note to the README so that this is more clear. I don't have a timeline on when this package will be complete, but I would advise you to keep using PathoScope in the meantime until MetaScope is fully functioning. I can let you know when that happens!

Brie

aubreyodom commented 2 years ago

Also, if you're having issues with PathoScope, I'm happy to help troubleshoot. You can contact me at aodom@bu.edu

aubreyodom commented 1 year ago

Hi VanishingRasengan!

I've fixed a bunch of the major efficiency issues, so that MetaScope should be working more efficiently now. You should reinstall the package.

With regard to the index files not being identified - when you are using download_refseq, where are you saving the initial files to? They should be in your home directory, which I assume you are then moving to this temporary folder to create the indices.

susheelbhanu commented 1 year ago

Hi @aubreyodom,

Sorry for reopening this issue, but I'm running into the same error. Is my understanding correct that MetaScope expects the output of download_refseq and the index files to be in the same folder?

I have the following:

# refseq download
/hdd0/susbus/databases/metascope/bacteria.fasta.gz

# indices
/hdd0/susbus/databases/metascope/index/target.4.bt2l

Should both files be named target and placed in the same directory?

Thanks for your help!

susheelbhanu commented 1 year ago

Update: The issue stemmed from improper index generation, which required a lot more resources. Also, the name of the file didn't matter, as long as the basename was provided in the command.

Thanks for a great tool @aubreyodom!
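
To make the basename point concrete: a Bowtie2 index is a set of six files that share a basename (`.1` to `.4`, `.rev.1`, and `.rev.2`), ending in `.bt2` for a small index or `.bt2l` for a large one. Below is a minimal base-R sketch for checking whether a complete set exists before calling the alignment step. The helper name `has_bowtie2_index` is hypothetical and not part of MetaScope, and the demo uses empty placeholder files in a scratch directory rather than a real index.

```r
# Hypothetical helper (not part of MetaScope): TRUE when lib_dir holds a
# complete Bowtie2 index whose files start with `basename`.
has_bowtie2_index <- function(lib_dir, basename) {
  parts <- c("1", "2", "3", "4", "rev.1", "rev.2")
  small <- file.path(lib_dir, paste0(basename, ".", parts, ".bt2"))
  large <- file.path(lib_dir, paste0(basename, ".", parts, ".bt2l"))
  # Either the full small set or the full large set must be present
  all(file.exists(small)) || all(file.exists(large))
}

# Self-contained demonstration in a scratch directory
demo_dir <- tempfile("bt2_demo_")
dir.create(demo_dir)

# Nothing has been written yet, so no index is found
empty_result <- has_bowtie2_index(demo_dir, "target")

# Create empty placeholder files named like a large index
placeholder <- paste0("target.", c("1", "2", "3", "4", "rev.1", "rev.2"), ".bt2l")
invisible(file.create(file.path(demo_dir, placeholder)))
full_result <- has_bowtie2_index(demo_dir, "target")
```

In the setup described above, the call would be `has_bowtie2_index("/hdd0/susbus/databases/metascope/index", "target")`. A result of FALSE after the build reports success suggests the build did not finish writing its files.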

susheelbhanu commented 1 year ago

@VanishingRasengan you likely need more CPUs and RAM for the index generation. I just built an index from the entire RefSeq, and the build ran through without stopping but didn't create the files because I didn't have enough resources.

It was fixed by giving it more threads and memory! Make sure you set the threads argument, because the default uses only 1 thread.

target_map <-
  MetaScope::align_target_bowtie(
    read1 = readPath,
    lib_dir = index_temp,
    libs = "target",
    align_dir = output_temp,
    align_file = "bowtie_target",
    overwrite = TRUE,
    threads = 48 # or 72 or 96, whatever your machine can handle
  )
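
Since the shortfall described above was in index generation rather than alignment, the same setting can be tried on the index build. This is a minimal sketch, assuming `mk_bowtie_index` accepts a `threads` argument in the same way `align_target_bowtie` does (check `?MetaScope::mk_bowtie_index` for your installed version). `pick_threads` is a hypothetical helper, and `target_ref_temp` and `index_temp` are the tutorial directories used earlier in this thread.

```r
# Hypothetical helper: choose a thread count, leaving `reserve` cores free.
# parallel::detectCores() can return NA on some platforms, so fall back to 1.
pick_threads <- function(reserve = 1L) {
  n <- parallel::detectCores()
  if (is.na(n)) {
    return(1L)
  }
  max(1L, as.integer(n) - as.integer(reserve))
}

n_threads <- pick_threads()

# Only attempt the build when MetaScope is installed and the tutorial
# directories exist in this session.
# ASSUMPTION: mk_bowtie_index() takes `threads`; verify in its help page.
if (requireNamespace("MetaScope", quietly = TRUE) &&
    exists("target_ref_temp") && exists("index_temp")) {
  MetaScope::mk_bowtie_index(
    ref_dir = target_ref_temp,
    lib_dir = index_temp,
    lib_name = "target",
    threads = n_threads,
    overwrite = TRUE
  )
}
```

Threads alone may not be enough. The fix above also involved more memory, which has to come from the machine or the job scheduler rather than from an R argument.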

aubreyodom commented 1 year ago

Thanks @susheelbhanu ! I was going to take a look today but you beat me to it.