carlferr / HOME-BIO

HOME-BIO (sHOtgun MEtagenomic analysis of BIOlogical entities)
GNU Lesser General Public License v3.0

kraken2: unable to find Db_Kraken2_Kaiju_bacteria in $KRAKEN2_DB_PATH (undefined) #8

Open arpit20328 opened 4 months ago

arpit20328 commented 4 months ago

Hi, after changing into the HOME-BIO-master directory I ran the following command:

(bowtie2) arpit@jarvis:~/HOME-BIO-master$ /home/arpit/miniconda3/bin/python3.12 Script.py

Error that I get:

Running Shotgun Analysis
Running kraken for bacteria
kraken2: unable to find Db_Kraken2_Kaiju_bacteria in $KRAKEN2_DB_PATH (undefined)
Error: cannot create Kraken file

config file : path kraken2 & kaiju databases = /home/arpit/krakendb_bacteria/

Path folder files:

image

Please let me know where I am going wrong.

arpit20328 commented 4 months ago

Following errors are also there:

image

BioH4z commented 4 months ago

Dear @arpit20328, everything seems correct. Let's try giving full permissions to the database folders (chmod -R 777 /home/arpit/krakendb_bacteria/). If this does not work, please try moving all the DB folders separately into the home directory. That way, you only need to change the path in the config file to "/home/".

Please, let me know if the errors persist.

arpit20328 commented 4 months ago

@BioH4z

I changed the Script.py like this

        # Running "kraken" for bacteria

                print("Running kraken for bacteria")
                output_log.write("Running kraken for bacteria\n")
                os.system("mkdir -p "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation")
                if (split_confidence == "default") or (split_confidence == "0.5"):
                    os.system("kraken2 --db /home/arpit/krakendb_bacteria/Db_Kraken2_Kaiju_bacteria.kraken --threads "+n_thread+" --unclassified-out "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation/"+sample_only_name_a_noR1[0]+"_unclass_seq.fastq --classified-out "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation/"+sample_only_name_a_noR1[0]+"_class_seq.fastq  --use-names --report "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation/"+sample_only_name_a_noR1[0]+"_report_kraken2.txt --output "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation/"+sample_only_name_a_noR1[0]+"_output_kraken2.txt --confidence 0.5 "+split_output_path+"/3_STAR_Alignment/"+sample_only_name_a_noR1[0]+"Unmapped.out.mate1.fastq")
                else:
                    os.system("kraken2 --db /home/arpit/krakendb_bacteria/Db_Kraken2_Kaiju_bacteria.kraken --threads "+n_thread+" --unclassified-out "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation/"+sample_only_name_a_noR1[0]+"_unclass_seq.fastq --classified-out "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation/"+sample_only_name_a_noR1[0]+"_class_seq.fastq  --use-names --report "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation/"+sample_only_name_a_noR1[0]+"_report_kraken2.txt --output "+split_output_path+"/4_Kraken2_bacteria_Shotgun_Annotation/"+sample_only_name_a_noR1[0]+"_output_kraken2.txt --confidence 0.5 "+split_output_path+"/3_STAR_Alignment/"+sample_only_name_a_noR1[0]+"Unmapped.out.mate1.fastq --confidence "+split_confidence)   

Is it OK to make this modification?

BioH4z commented 4 months ago

Dear @arpit20328, I am not sure the modification will work, because we take the database path and mount it inside the Docker container (which changes the path). In brief, the config file option is used only to know where to pick up the files for Docker. Furthermore, if I remember correctly, I pointed the --db option to a folder, not a file. Are you using a custom database? With our databases, which can be downloaded from Zenodo, you just need to point the script to the KRAKENdb_bacteria folder and it will search for the files inside.
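The point about --db expecting a folder rather than a file can be checked before launching the pipeline. The sketch below is not part of HOME-BIO. It assumes a standard Kraken2 database directory, which is expected to contain hash.k2d, opts.k2d and taxo.k2d; the helper name missing_kraken2_files is hypothetical.

```python
from pathlib import Path

# Files a built Kraken2 database directory is expected to contain
# (assumption: standard Kraken2 layout; adjust if your build differs).
REQUIRED_K2_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")


def missing_kraken2_files(db_dir):
    """Return the required files that are missing from db_dir.

    If db_dir is not a directory (for example, a path to a single
    file), every required file is reported as missing.
    """
    db_path = Path(db_dir)
    if not db_path.is_dir():
        return list(REQUIRED_K2_FILES)
    return [name for name in REQUIRED_K2_FILES if not (db_path / name).is_file()]
```

An empty list from missing_kraken2_files("/home/arpit/krakendb_bacteria/KRAKENdb_bacteria") would suggest that folder can be passed to --db directly, assuming the Zenodo archive stores the .k2d files at the top level of that folder.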

arpit20328 commented 4 months ago

Yes, I am using your Zenodo database.

BioH4z commented 4 months ago

Have you tried changing the permissions? This looks like a problem with the Docker container. Please also try changing the path and putting the databases in /home/, because this is also a well-known issue with Docker. Do you have permissions on /home/?

arpit20328 commented 4 months ago

I do not have permission to do that. Have a look below. Is there any solution to this issue?

(bowtie2) arpit@jarvis:~/HOME-BIO-master$ chmod -R 777 /home/arpit/krakendb_bacteria/
chmod: changing permissions of '/home/arpit/krakendb_bacteria/KRAKENdb_protozoa': Operation not permitted
chmod: changing permissions of '/home/arpit/krakendb_bacteria/KRAKENdb_bacteria': Operation not permitted
chmod: changing permissions of '/home/arpit/krakendb_bacteria/KRAKENdb_viruses': Operation not permitted
chmod: changing permissions of '/home/arpit/krakendb_bacteria/KAIJUdb': Operation not permitted

(bowtie2) arpit@jarvis:~/HOME-BIO-master$ chmod -R 777 /home/arpit/
chmod: changing permissions of '/home/arpit/krakendb_bacteria/KRAKENdb_protozoa': Operation not permitted
chmod: changing permissions of '/home/arpit/krakendb_bacteria/KRAKENdb_bacteria': Operation not permitted
chmod: changing permissions of '/home/arpit/krakendb_bacteria/KRAKENdb_viruses': Operation not permitted
chmod: changing permissions of '/home/arpit/krakendb_bacteria/KAIJUdb': Operation not permitted
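chmod reports "Operation not permitted" when the calling user is neither the owner of the file nor root, which suggests these database folders belong to another account. A minimal sketch for confirming that on a POSIX system follows; describe_access is a hypothetical helper, not part of HOME-BIO.

```python
import os


def describe_access(path):
    """Report ownership and access of path for the current user (POSIX only)."""
    info = os.stat(path)
    return {
        "owner_uid": info.st_uid,                      # numeric owner of the path
        "owned_by_me": info.st_uid == os.getuid(),     # chmod needs this (or root)
        "mode": oct(info.st_mode & 0o777),             # permission bits only
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }
```

Running describe_access on each folder named in the chmod output shows whether the current user owns it; if "owned_by_me" is False, only the owner or an administrator can change its permissions.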

BioH4z commented 4 months ago

Dear @arpit20328, I'm very sorry. This is a known issue that we have with Docker. You can find something similar in our issues section. Unfortunately, we don't have a solution, because for security reasons Docker requires permissions to read, write and also load the folder into the container.

If you want, you can also check the Dockerfile on my GitHub page. We also added the user 'homebio' in order to give permissions for all the operations required by the pipeline, but it is not enough. We are trying to find a solution, but at the moment I am not able to suggest a feasible one.

The only option is to try our pipeline on Windows (this kind of problem does not seem to occur there). If you are able to find a solution, please let us know.

arpit20328 commented 4 months ago

@BioH4z got it.

I have contacted the admin of my lab.

I will also try it on Windows.

Thanks for the detailed reply. I will get back to you very soon.

arpit20328 commented 4 months ago

@carlferr @BioH4z Hi, I think I am facing a fundamental problem here.

The admin of my lab has granted the following permissions:
chmod -R 777 /home/arpit/krakendb_bacteria
chmod -R 777 /home/arpit/krakendb_protozoa
chmod -R 777 /home/arpit/krakendb_virus
chmod -R 777 /home/arpit/kaijudb

Command I ran:

(bowtie2) arpit@jarvis:~/HOME-BIO-master$ /home/arpit/miniconda3/bin/python3.12 Script.py

Config file: config_file.txt

Script file: Script.txt

Screenshots:

image image

/home/arpit/krakendb_bacteria:

image

Error Screenshot:

image

kindly help in this regard.

BioH4z commented 4 months ago

Dear @arpit20328, a few things are not clear to me:
1) In the config file I can see that the input and output files are in the same folder. This could be a problem, because the pipeline may confuse or overwrite input and output files.
2) The database names in /home/arpit should be written as in /home/arpit/krakendb_bacteria (capitalized).
3) Did you try setting /home/arpit/krakendb_bacteria as the database path in the config file, or did you only try /home/arpit?
4) Why are there also bacteria, protozoa, viruses and kaiju folders in /home/arpit/krakendb_bacteria? Remember that the pipeline searches for specific names in specific folders.
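The first and fourth points can be checked mechanically. The sketch below assumes that the four folder names shown in the chmod output earlier in this thread (KRAKENdb_bacteria, KRAKENdb_protozoa, KRAKENdb_viruses, KAIJUdb) are the ones the pipeline looks for under the configured database path. That assumption and the helper name check_layout are illustrative, not taken from the HOME-BIO code.

```python
import os

# Folder names taken from the chmod output earlier in this thread
# (assumption: these are the names the pipeline expects).
EXPECTED_DB_FOLDERS = (
    "KRAKENdb_bacteria",
    "KRAKENdb_protozoa",
    "KRAKENdb_viruses",
    "KAIJUdb",
)


def check_layout(db_root, input_dir, output_dir):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    # Sharing one folder for input and output risks overwriting files.
    if os.path.realpath(input_dir) == os.path.realpath(output_dir):
        problems.append("input and output folders are the same")
    # Names are compared exactly, so capitalization matters.
    for name in EXPECTED_DB_FOLDERS:
        if not os.path.isdir(os.path.join(db_root, name)):
            problems.append("missing database folder: " + name)
    return problems
```

An empty result means the input and output folders differ and all four expected folders exist under db_root; it says nothing about permissions or about what Docker sees after mounting.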

I will try to rebuild the dockerfile and see if there are problems with the pipeline

BioH4z commented 4 months ago

Dear @arpit20328, I checked the Docker container and there is no issue. It works and is able to run all the commands. I created a Dockerfile without the Entrypoint for you. You can build it on your system and use it to run the kraken and kaiju commands. Remember to add the -v option to mount your folder inside the container, like this: docker run --rm -it -v "$PWD:/home/arpit" [NAME OF THE CONTAINER]

I know this is not a solution but, this way, you'll have a container with all the software needed for your analysis already installed.

Dockerfile.zip