CSB5 / OPERA-MS

OPERA-MS - Hybrid Metagenomic Assembler

Error when running strain clustering #92

Open Xinpeng021001 opened 3 months ago

Xinpeng021001 commented 3 months ago

Hi,

I ran into an unexpected error when running the strain clustering step:

    (/common/yinlab/xinpeng/conda_env/operams) [xinpeng@login1.swan new]$ cat /lustre/work/yinlab/xinpeng/final_course_project/glacier_algae/process/part3_opera-ms/megahit/new/.//intermediate_files/strain_analysis/coverage_clustering.err
    *** Open file /common/yinlab/xinpeng/GTDB_db/OPERA-MS-DB/genomes_13/CAIKAD01_sp028703835__GCA_028703835.1_genomic.fna.gz
    Died at /work/yinlab/xinpeng/final_course_project/glacier_algae/script/OPERA-MS/bin/coverage_clustering.pl line 122, line 2.

I've checked that the fna.gz file exists and can be read.

Best Regards, Xinpeng

jsgounot commented 3 months ago

Hi,

I assume that the file might be a symlink. Can you confirm that the link is absolute? It is possible that OPERA-MS crashes because it is executed in a different folder if this is not the case.
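One way to check this is a small Python helper like the one below. It is only a sketch: `describe_link` is a hypothetical name, not part of OPERA-MS, and it simply reports whether a path is a symlink, whether the stored target is absolute, and whether the target exists.

```python
import os


def describe_link(path):
    """Summarise a possible symlink (hypothetical helper, not part of OPERA-MS)."""
    is_link = os.path.islink(path)
    # os.readlink returns the target exactly as stored in the link.
    target = os.readlink(path) if is_link else None
    return {
        "is_symlink": is_link,
        "target": target,
        "target_is_absolute": bool(target) and os.path.isabs(target),
        # os.path.exists follows symlinks, so a dangling link gives False.
        "resolves": os.path.exists(path),
    }
```

Calling `describe_link(...)` on one of the genome files in the database folder should show `target_is_absolute: True` and `resolves: True` if the link is set up as expected.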

JS

Xinpeng021001 commented 3 months ago

Hi JS,

I tried different paths (both absolute and symlinked), but I always get this error. I checked the Perl script and suspect the problem is here:

    open(CHECK_REF, $strain_path) or die;

and the $strain_path is from here:

    my ($strain, $strain_path);
    while (<STRAINS>){
        chomp $_;
        if($_ =~ />(.*)/){
            $cluster = $1;
            chomp $cluster;
            #operate at a species level
            my $strain = <STRAINS>;
            chomp $strain;
            $strain_path = "$opera_ms_db/../$strain";

This confuses me a little. I think $strain_path should point into the same folder as $opera_ms_db, because the Python script did not create a separate strain database.
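To see what that expression produces, here is a minimal Python sketch of the same string concatenation. The function name and the example paths are hypothetical, and `os.path.normpath` only collapses `..` lexically, so the result can differ from what the filesystem does if the database folder is itself a symlink.

```python
import os


def build_strain_path(opera_ms_db, strain):
    """Mimic the Perl expression "$opera_ms_db/../$strain" (illustrative only)."""
    # Plain string concatenation, as in the Perl code: the strain entry is
    # appended after the parent of the database folder, even if the entry
    # itself starts with "/".
    raw = f"{opera_ms_db}/../{strain}"
    # Lexical clean-up only; symlinks are not resolved here.
    return os.path.normpath(raw)


# Hypothetical values, not taken from a real OPERA-MS installation.
relative_entry = build_strain_path(
    "/opt/OPERA-MS/OPERA-MS-DB", "OPERA-MS-DB/genomes/example.fna.gz"
)
absolute_entry = build_strain_path(
    "/opt/OPERA-MS/OPERA-MS-DB", "/data/genomes/example.fna.gz"
)
```

With these inputs, an entry written relative to the parent of the database folder lands back inside the database folder, while an entry that is already an absolute path is appended to that parent folder and no longer points at the original file.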

Best Regards, Xinpeng

jsgounot commented 3 months ago

OK, honestly I'm not sure what is happening. In step 5 of the wiki page, I indicate that the DB folder must be inside the OPERA-MS main folder. Is that what you did? If not, it is important to create the OPERA-MS-DB folder inside the OPERA-MS main folder, not somewhere else, and then symlink each of the genome files inside it.

Otherwise, I know it's a bit of a hassle, but would you be able to edit line 121 of your OPERA-MS Perl file to change $strain to $strain_path, and then check the path it reports in the log files?

It is possible there are some inconsistencies within OPERA-MS, as I've been trying to keep my changes to the old codebase as small as possible.

JS

Xinpeng021001 commented 3 months ago

Hi JS,

Sorry for the late reply. I think I misunderstood the instruction that the "DB folder must be inside the OPERA-MS main folder"; I did not create it under that path. Maybe you could add one extra command to the wiki page, such as cd OPERA-MS, which would make this step clearer and easier to follow.

Best Regards, Xinpeng

jsgounot commented 3 months ago

Sure, I'm hoping it works now.