BackofenLab / CRISPRloci


summary_identify.csv is missing #3

Open Edison2021 opened 2 years ago

Edison2021 commented 2 years ago

Hi Alex,

I ran a trial test and got the error below.

Command:
python3.7 CRISPRloci_standalone.py -f NC_005230.fasta -output NC_005230.fasta.dir -st dna -cpu 32

Final output:
summary crispr Example/NC_005230.fasta.dir/summary_crisp.csv
dirname cas Example/NC_005230.fasta.dir/tmp/output-Casboundary/predictions/
This file does not exist: Example/NC_005230.fasta.dir/summary_identify.csv

Best Edison

Edison2021 commented 2 years ago

In addition, another error is reported:

Error: Unable to access jarfile CRISPRloci_webserver_visualization/CRISPRloci_visualization.jar

I searched the CRISPRloci folder but did not find a CRISPRloci_webserver_visualization folder.

Best Edison

niccw commented 2 years ago

@Edison2021

I encountered the same error. After checking, one of the biggest problems is that environments.yml is missing packages that CRISPRidentify requires.
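For anyone who wants to confirm this on their own install, here is a minimal sketch that reports which Python modules cannot be found in the active environment. The thread does not list the packages CRISPRidentify needs, so the module names must be supplied by the caller (for example, from CRISPRidentify's own environment file); the function name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from `module_names` that cannot be located.

    Assumption: the caller supplies the import names of the packages
    CRISPRidentify depends on; this thread does not list them.
    """
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for dotted names whose parent package is absent,
            # or for modules with a broken __spec__.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Run it inside the activated conda environment. Note that an import name can differ from the conda or pip package name, so an empty result is a hint rather than proof that the environment is complete.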

Alexander-Mitrofanov commented 2 years ago

Thank you for submitting the problems. The fixes will be applied at the beginning of April. For the time being please try to use the CRISPRloci web interface.

JPegorino commented 2 years ago

Just in case it helps with the bug fixes: I'm getting what I think are related errors at the end of April (running on a complete reference genome from NCBI), so I'm copying them here:

cp: cannot stat '/home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-CRISPRidentify/GCA_000008485/CRISPR*': No such file or directory
cp: cannot stat '/home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-CRISPRidentify/GCA_000008485/Spacers*': No such file or directory
This file does not exist: /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-Casboundary/predictions/
Error: Unable to access jarfile /home/ubuntu/software/CRISPRloci-1.0.0/CRISPRloci_webserver_visualization/CRISPRloci_visualization.jar
summary crispr /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/summary_crisp.csv
dirname cas /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-Casboundary/predictions/
summary crispr /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/summary_crisp.csv
dirname cas /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/tmp/output-Casboundary
This file does not exist: /home/ubuntu/volume2/pangenome/MGEs/CRISPR_Cas/test/summary_identify.csv
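The messages above all point at paths that were never created. As a minimal sketch, the check below lists which of those paths are absent after a run. The relative paths are copied from the log lines in this thread; the helper name `find_missing_outputs` and the idea of running such a check are assumptions, not part of CRISPRloci.

```python
from pathlib import Path

# Relative paths taken from the error messages quoted in this thread.
EXPECTED_OUTPUTS = (
    "summary_crisp.csv",
    "summary_identify.csv",
    "tmp/output-Casboundary/predictions",
)
JAR_RELATIVE_PATH = (
    "CRISPRloci_webserver_visualization/CRISPRloci_visualization.jar"
)


def find_missing_outputs(output_dir, install_dir):
    """Return the expected paths that do not exist.

    `output_dir` is the folder passed to -output; `install_dir` is the
    folder containing CRISPRloci_standalone.py. Both are assumptions about
    how the paths in the log messages are built.
    """
    candidates = [Path(output_dir) / name for name in EXPECTED_OUTPUTS]
    candidates.append(Path(install_dir) / JAR_RELATIVE_PATH)
    return [str(path) for path in candidates if not path.exists()]
```

Calling `find_missing_outputs("test", "/home/ubuntu/software/CRISPRloci-1.0.0")` returns a list of the absent paths, which makes it easier to tell a missing visualization jar apart from a failed CRISPRidentify step.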

Many thanks, Jamie

Alexander-Mitrofanov commented 2 years ago

Thank you for the feedback. I'm starting to work on the fixes.

pjbiggs commented 2 years ago

Hi, I want to run the code on hundreds of genomes. I am having a similar issue to those above, but I also see additional problems.

When I ran the command for the first time, the 4 .tar.gz files were extracted for Casboundary, but this did not happen automatically for CRISPRcasIdentifier. I extracted them manually and now have trained_models.tar.gz and HMM_sets.tar.gz in the CRISPRcasIdentifier folder along with their extracted folders. I also have the extracted folders in the root CRISPRloci-1.0.0 folder.

I am running the test command into a new results folder:

python3.7 CRISPRloci_standalone.py -f Example/NC_005230.fasta -st dna -output test1

My errors are as below:

1. Run initial array detection
2. Refine detected arrays
3. Evaluate candidates
4. Enhance evaluated arrays
5. Complement arrays with additional info
Traceback (most recent call last):
  File "components/module_non_array_computations.py", line 164, in _calculate_strand
    st = StrandComputationNew(list_of_crisprs=self.list_of_crisprs_bona_fide)
  File "components/components_non_array_computations.py", line 111, in __init__
    self._compute_all_strands()
  File "components/components_non_array_computations.py", line 134, in _compute_all_strands
    with open("ResultsStrand/CRISPRstrand_Summary.tsv", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'ResultsStrand/CRISPRstrand_Summary.tsv'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "CRISPRidentify.py", line 249, in <module>
    run_over_one_file(complete_path_file, folder_result, pickle_folder)
  File "CRISPRidentify.py", line 210, in run_over_one_file
    flag_dev_mode=FLAG_DEVELOPER_MODE)
  File "components/pipeline.py", line 32, in __init__
    self._run_non_crispr_computation()
  File "components/pipeline.py", line 75, in _run_non_crispr_computation
    flag_dev_mode=self.flag_dev_mode)
  File "components/module_non_array_computations.py", line 34, in __init__
    self._calculate_all_non_array_values()
  File "components/module_non_array_computations.py", line 46, in _calculate_all_non_array_values
    self._calculate_strand()
  File "components/module_non_array_computations.py", line 171, in _calculate_strand
    st = StrandComputation(list_of_crisprs=self.list_of_crisprs_bona_fide)
  File "components/components_non_array_computations.py", line 93, in __init__
    self._compute_all_strands()
  File "components/components_non_array_computations.py", line 98, in _compute_all_strands
    strand = get_orientation(consensus)
  File "components/components_non_array_computations.py", line 48, in get_orientation
    f = open("prediction", "r")
FileNotFoundError: [Errno 2] No such file or directory: 'prediction'
Error: Unable to access jarfile /home/pbiggs/software/CRISPRloci-1.0.0/CRISPRloci_webserver_visualization/CRISPRloci_visualization.jar
summary crispr /home/pbiggs/software/CRISPRloci-1.0.0/test1/summary_crisp.csv
dirname cas /home/pbiggs/software/CRISPRloci-1.0.0/test1/tmp/output-Casboundary/predictions/
This file does not exist: /home/pbiggs/software/CRISPRloci-1.0.0/test1/summary_identify.csv

I am running this in WSL2 using Ubuntu 20.04, and I have no issues with conda environments. Any thoughts please on how to solve this? Thanks, Patrick

dxter-zz commented 1 year ago

Regarding the Readme example: as pjbiggs mentioned above, the tar archives in CRISPRcasIdentifier are not extracted automatically, although the Readme says they should be.
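Until that is fixed, the archives can be unpacked by hand. The following is a minimal sketch of the manual step, assuming the archives sit directly inside the CRISPRcasIdentifier folder and should be extracted in place (the thread mentions trained_models.tar.gz and HMM_sets.tar.gz there). The function name `extract_archives` is hypothetical, and where the tool actually expects the extracted folders is not confirmed here.

```python
import tarfile
from pathlib import Path


def extract_archives(folder):
    """Extract every *.tar.gz found directly inside `folder`, in place.

    Assumption: extracting next to the archive is what the tool expects.
    Only use this on archives from a trusted source, since extractall()
    writes whatever paths the archive contains.
    """
    folder = Path(folder)
    extracted = []
    for archive in sorted(folder.glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(path=folder)
        extracted.append(archive.name)
    return extracted
```

For example, `extract_archives("CRISPRcasIdentifier")` run from the CRISPRloci root returns the names of the archives it unpacked.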

Also, the example:

python3.7 CRISPRloci_standalone.py -f Example/NC_005230_proteins.fasta -st protein

should be

python3.7 CRISPRloci_standalone.py -f Example/NC_005230_proteins.fa -st protein

I'm also not seeing Example/Input3.fa for the virus example.
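A mismatch like .fasta versus .fa is easy to catch before launching a run. Here is a minimal sketch that, when the given input path is absent, lists files in the same folder that share its base name. The helper name `suggest_input` is hypothetical and not part of CRISPRloci.

```python
from pathlib import Path


def suggest_input(path):
    """Return [path] if it exists, else same-stem files in its folder.

    An empty list means neither the file nor a similarly named
    candidate was found.
    """
    path = Path(path)
    if path.exists():
        return [str(path)]
    if not path.parent.is_dir():
        return []
    # Match any extension on the same base name, e.g. ".fa" for ".fasta".
    return sorted(str(p) for p in path.parent.glob(path.stem + ".*"))
```

Given the Readme path `Example/NC_005230_proteins.fasta`, this would surface `Example/NC_005230_proteins.fa` if that file is present, and return an empty list for a file that is missing altogether.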