BackofenLab / CRISPRcasIdentifier

Machine learning for accurate identification and classification of CRISPR-Cas systems
GNU General Public License v3.0

The output was not generated in the directory that I specified #4

Closed FA387 closed 1 year ago

FA387 commented 1 year ago

I am using this pipeline to annotate the Cas Cascade of MAGs. I want to run them all together, so I made a new folder that will contain a separate folder for each MAG.

I used this command to run the pipeline:

python CRISPRcasIdentifier.py -f /data/users/avi123/metagenomes/ereum/MAGs_ER/bin.57.fa -st dna -sc complete -o /data/users/avi123/metagenomes/ereum/MAGs_ER/analysis/Cas_Cascade/bin.57_HQ

It generated the result, but not in the desired directory.

It also reported an error that I couldn't figure out (I am still a novice in coding). Here is the error message:

`Running prodigal on DNA sequences
Running hmmsearch (log and outputs stored in /data/users/avi123/metagenomes/ereum/MAGs_ER/analysis/Cas_Cascade/bin.57_HQ)
Annotating proteins
Building cassettes
Saving cassette(s) to /data/users/avi123/metagenomes/ereum/MAGs_ER/analysis/Cas_Cascade/bin.57_HQ/HMM2019_cassette_arrays.txt


There are no unlabeled proteins for cassette # 1 and HMM2019

There are 2 unlabeled proteins for cassette # 2 and HMM2019
ERT missing bitscore prediction for cassette #2, HMM2019 and cas13 (1/2): 0.416399
ERT missing bitscore prediction for cassette #2, HMM2019 and cas12 (2/2): 0.171809

There are no unlabeled proteins for cassette # 3 and HMM2019

Loading classifiers and running classification
Predictions for HMM2019 and ERT regressor

Cassette #1 -- ERT classifier: CAS-V-A

Cassette #2 -- ERT classifier: CAS-VI-B

Cassette #3 -- ERT classifier: CAS-I-C


Saving class predictions to /data/users/avi123/metagenomes/ereum/MAGs_ER/analysis/Cas_Cascade/bin.57_HQ
Traceback (most recent call last):
  File "CRISPRcasIdentifier.py", line 434, in <module>
    output_df.to_csv(args.output_file, index=False)
  File "/data/users/avi123/miniconda3/envs/CRISPRCasIdentifier/lib/python3.7/site-packages/pandas/core/generic.py", line 3228, in to_csv
    formatter.save()
  File "/data/users/avi123/miniconda3/envs/CRISPRCasIdentifier/lib/python3.7/site-packages/pandas/io/formats/csvs.py", line 183, in save
    compression=self.compression,
  File "/data/users/avi123/miniconda3/envs/CRISPRCasIdentifier/lib/python3.7/site-packages/pandas/io/common.py", line 399, in _get_handle
    f = open(path_or_buf, mode, encoding=encoding, newline="")
IsADirectoryError: [Errno 21] Is a directory: '/data/users/avi123/metagenomes/ereum/MAGs_ER/analysis/Cas_Cascade/bin.57_HQ'`

Can anyone help me, please? Thank you in advance! Best regards

padilha commented 1 year ago

It seems that /data/users/avi123/metagenomes/ereum/MAGs_ER/analysis/Cas_Cascade/bin.57_HQ is a directory on your machine. Note that the -o option specifies the path of the output CSV file, and in your case that path is an existing directory, so the predictions cannot be written to it (hence the IsADirectoryError). If you try something like

python CRISPRcasIdentifier.py -f /data/users/avi123/metagenomes/ereum/MAGs_ER/bin.57.fa -st dna -sc complete -o /data/users/avi123/metagenomes/ereum/MAGs_ER/analysis/Cas_Cascade/bin.57_HQ/predictions.csv

I believe it may work. If it doesn't, please let me know.
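For anyone scripting around this, the same mistake can be caught before the run starts. Below is a minimal sketch of a wrapper-side check, not part of CRISPRcasIdentifier itself: `resolve_output_file` is a hypothetical helper, and the default name `predictions.csv` is simply borrowed from the command above. If the given path is an existing directory, it returns a file path inside that directory instead.

```python
from pathlib import Path


def resolve_output_file(output: str, default_name: str = "predictions.csv") -> Path:
    """Return a path that can safely be passed as the -o CSV file.

    Hypothetical helper for illustration; not part of CRISPRcasIdentifier.
    """
    path = Path(output)
    # An existing directory cannot be opened as a file (IsADirectoryError),
    # so place the CSV inside that directory instead.
    if path.is_dir():
        path = path / default_name
    # Create the parent folder if it is missing, so opening the file succeeds.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Passing the returned path to `-o` should avoid the error shown in the traceback, under the assumption that nothing else in the pipeline expects `-o` to be a directory.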

Best, Victor.

FA387 commented 1 year ago

Dear Dr. Victor, it worked. Thank you for the solution.

Have a good day. Best regards, FA
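Since the original goal was to process many MAGs with one output folder each, here is a rough sketch of how that could be scripted with the fix applied. It only reuses the flags from the commands in this thread (`-f`, `-st dna`, `-sc complete`, `-o`); the function names, the `*.fa` pattern, and the per-MAG `predictions.csv` file name are assumptions for illustration, and it assumes the script is started from the directory that contains CRISPRcasIdentifier.py.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(fasta: Path, out_root: Path) -> List[str]:
    """Build one CRISPRcasIdentifier call whose -o points at a CSV file."""
    # One folder per MAG, named after the FASTA file (bin.57.fa -> bin.57).
    out_dir = out_root / fasta.stem
    return [
        "python", "CRISPRcasIdentifier.py",
        "-f", str(fasta),
        "-st", "dna",
        "-sc", "complete",
        # -o must be a file path, not a directory.
        "-o", str(out_dir / "predictions.csv"),
    ]


def run_all(mag_dir: Path, out_root: Path) -> None:
    """Run the pipeline once per *.fa file found in mag_dir (assumed layout)."""
    for fasta in sorted(mag_dir.glob("*.fa")):
        cmd = build_command(fasta, out_root)
        # Create the per-MAG folder up front so the CSV can be written.
        Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
        subprocess.run(cmd, check=True)
```

A call such as `run_all(Path("MAGs_ER"), Path("MAGs_ER/analysis/Cas_Cascade"))` would then run the bins one after another and stop at the first failing run because of `check=True`.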