Ecogenomics / GTDBTk

GTDB-Tk: a toolkit for assigning objective taxonomic classifications to bacterial and archaeal genomes.
https://ecogenomics.github.io/GTDBTk/
GNU General Public License v3.0

Troubleshooting GTDB-Tk Database Installation and Environment Configuration #596

Open soojunglee98 opened 2 months ago

soojunglee98 commented 2 months ago

Additional comments

As mentioned on this website, I installed the conda environment and tried to download the database.

When I run download.sh, it says:

"Cannot write to '/home/spotgiet/miniconda3/envs/gtdbtk-2.1.1/share/gtdbtk-2.1.1/db/gtdbtk_r207_v2_data.tar.gz' (No space left on device)."
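The "No space left on device" message means the filesystem holding the conda environment filled up during the download. A minimal sketch of a pre-download check follows; the helper name `has_free_space` and the size passed to it are hypothetical, and the real size of the reference data should be taken from the download page.

```shell
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# NEEDED_KB kilobytes available.
has_free_space() {
    dir=$1
    needed_kb=$2
    # df -P prints POSIX-format output; on the second line, column 4 is the
    # available space, reported in 1 KB blocks because of -k.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge "$needed_kb" ]
}
```

For example, `has_free_space /scratch 100000000 && echo "enough space"` would test a scratch directory against a placeholder requirement of roughly 100 GB.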

So, I manually downloaded the database to another directory due to space limitations in my home directory. However, I encountered this error:


================================================================================
                                     ERROR                                      
________________________________________________________________________________

          The 'GTDBTK_DATA_PATH' environment variable is not defined.           

            Please set this variable to your reference data package.            
               https://github.com/Ecogenomics/GTDBTk#installation               
================================================================================

================================================================================
                                     ERROR                                      
________________________________________________________________________________

           The GTDB-Tk reference data does not exist or is corrupted.           
                GTDBTK_DATA_PATH=/path/to/unarchived/gtdbtk/data                

   Please compare the checksum to those provided in the download repository.    
          https://github.com/Ecogenomics/GTDBTk#gtdb-tk-reference-data          
================================================================================
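The second error suggests comparing the checksum of the downloaded archive with the one published in the download repository. A minimal sketch of that comparison follows, assuming the repository publishes MD5 checksums (swap in `sha256sum` if it lists SHA-256 instead); the helper name `verify_md5` is hypothetical.

```shell
# Hypothetical helper: compare a file's MD5 checksum with an expected value.
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(md5sum "$file" | awk '{print $1}')
    if [ "$actual" = "$expected" ]; then
        echo "checksum OK"
    else
        echo "checksum MISMATCH: expected $expected, got $actual"
        return 1
    fi
}
```

It would be called as `verify_md5 <downloaded archive> <published checksum>`, with both arguments taken from your own download.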

So again, as mentioned on the website, I activated my conda environment and tried to run:

conda env config vars set GTDBTK_DATA_PATH="/scratch/raskin_root/raskin0/shared_data/Soojung_Sarah/gtdb_tk/release220"

But it keeps saying: "To make your changes take effect, please reactivate your environment." even though my conda environment is already activated. Any suggestions? Thank you so much!

pchaumeil commented 2 months ago

You need to deactivate and then reactivate the environment for the changes to take effect:

conda deactivate
conda activate your_gtdbtk_environment_name

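This step is necessary because variables set with `conda env config vars set` are only exported when the environment is activated, so a shell that was already active does not pick them up.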
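After reactivating, it can help to confirm that the variable actually reached the shell before running GTDB-Tk again. A minimal sketch follows; the helper name `check_gtdbtk_data_path` is hypothetical, and it only checks that the variable is set and names a directory, not that the reference data inside is complete.

```shell
# Hypothetical helper: report whether GTDBTK_DATA_PATH is visible in the
# current shell and points to an existing directory.
check_gtdbtk_data_path() {
    if [ -z "${GTDBTK_DATA_PATH:-}" ]; then
        echo "GTDBTK_DATA_PATH is not set in this shell"
        return 1
    fi
    if [ ! -d "$GTDBTK_DATA_PATH" ]; then
        echo "GTDBTK_DATA_PATH is set but is not a directory: $GTDBTK_DATA_PATH"
        return 2
    fi
    echo "GTDBTK_DATA_PATH points to an existing directory: $GTDBTK_DATA_PATH"
}
```

Running `conda env config vars list` inside the activated environment shows what conda has stored, and `check_gtdbtk_data_path` shows what the shell actually sees. If the two disagree, the environment has not been reactivated in that shell.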

soojunglee98 commented 2 months ago

Yes, I did that several times but it doesn't work. Do you have any more suggestions? Thank you!

pchaumeil commented 1 month ago

Do you still get exactly the same error, or does the error now point to the new release220 folder? That is, does the error look like

The GTDB-Tk reference data does not exist or is corrupted.           
GTDBTK_DATA_PATH=/path/to/unarchived/gtdbtk/data    

or

The GTDB-Tk reference data does not exist or is corrupted.           
GTDBTK_DATA_PATH=/scratch/raskin_root/raskin0/shared_data/Soojung_Sarah/gtdb_tk/release220