H2muller / CROPSR

CROPSR is a Python tool designed for genome-wide gRNA design and evaluation for CRISPR experiments, with special focus on complex genomes such as those found in energy-producing crops. CROPSR is a product of the DOE Center for Advanced Bioenergy and Bioproducts Innovation (CABBI).
Apache License 2.0

test run failed #4

Closed anandksrao closed 2 years ago

anandksrao commented 2 years ago

After what appeared to be a normal install, I executed the test run as listed on your main GitHub page, and its entire STDOUT is copied and pasted below. As you can see, everything but the expected final line was generated.

Instead of the expected final line ("The output file has been generated at sample_data/output.csv"), I see errors raised from lines 476 and 523 of the CROPSR.py script.

Is this because my CROPSR is installed on a local Mac OS X machine rather than a UNIX / Linux based cluster (as mentioned in your paper)? Will CROPSR not run on Mac OS X at all, setting aside size limits and run time issues?

But if CROPSR can and will indeed run on Mac OS X, could you please help troubleshoot this error message?

Thank you in advance.

MacBook-Pro:CROPSR anand$ python3 CROPSR.py -f sample_data/sample_genome.fa -g sample_data/sample_genome.gff -o sample_data/sample_genome_output.csv --cas9 -v

################################################################################
##                                                                            ##
##                                                                            ##
##          .o88b.   d8888b.    .d88b.    d8888b.   .d8888.   d8888b.         ##
##         d8P  Y8   88  `8D   .8P  Y8.   88  `8D   88'  YP   88  `8D         ##
##         8P        88oobY'   88    88   88oodD'   `8bo.     88oobY'         ##
##         8b        88`8b     88    88   88ººº       `Y8b.   88`8b           ##
##         Y8b  d8   88 `88.   `8b  d8'   88        db   8D   88 `88.         ##
##          `Y88P'   88   YD    `Y88P'    88        `8888Y'   88   YD         ##
##                                                                            ##
##                                                                            ##
################################################################################
U.S. Dept. of Energy's Center for Advanced Bioenergy and Bioproducts Innovation
University of Illinois at Urbana-Champaign

        You are currently utilizing the following settings:

        CROPSR version:                                 1.11b
        Path to genome file in FASTA format:            sample_data/sample_genome.fa
        Path to output file:                            sample_data/sample_genome_output.csv
        Length of the gRNA sequence:                    20
        Length of flanking region for verification:     200
        Number of available CPUs:                       4
        Path to annotation file in GFF format:          sample_data/sample_genome.gff
        Path to annotation_info file in TXT format:     None
        Designing for CRISPR system:
            Streptococcus pyogenes Cas9                 True

Genome file sample_data/sample_genome.fa successfully imported
formatting genome
Genome file sample_data/sample_genome.fa successfully formatted
The genome was successfully converted to a dictionary
Annotation file sample_data/sample_genome.gff successfully imported
Annotation database successfully generated

            Initiating PAM site detection.

            Please wait, this may take a while...

                17314 Cas9 PAM sites were found on Chr01

Traceback (most recent call last):
  File "CROPSR.py", line 523, in <module>
    main()
  File "CROPSR.py", line 476, in main
    if args.cpf1:
AttributeError: 'Namespace' object has no attribute 'cpf1'

H2muller commented 2 years ago

CROPSR is fully functional in Mac OS. I have developed the code in both Mac and Linux environments, so that should not be a problem.

From your log message, it seems as if you have uncommented the functions that execute the CPF1 pipeline (lines 476 - 502) without uncommenting all the other parts required to make it work. This pipeline is commented out because it has not been fully implemented and its use is not endorsed at this moment.

If you can, please try to run it without modifications to the code, so we can validate that this is the only issue.

Best wishes, Hans
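
The traceback above is consistent with `args.cpf1` being read when no `--cpf1` argument was ever registered with the parser. Below is a minimal sketch of that failure mode, assuming the script parses its options with argparse (the `Namespace` in the traceback suggests so). The parser here is hypothetical and is not CROPSR's actual code:

```python
import argparse


def build_parser():
    # Hypothetical parser: only --cas9 is registered, so the parsed
    # Namespace will have no "cpf1" attribute at all.
    parser = argparse.ArgumentParser()
    parser.add_argument("--cas9", action="store_true")
    return parser


args = build_parser().parse_args(["--cas9"])

# Reading an attribute that was never registered raises AttributeError,
# the same kind of error shown in the traceback.
try:
    args.cpf1
    raised = False
except AttributeError:
    raised = True

# A defensive read falls back to False instead of raising.
use_cpf1 = getattr(args, "cpf1", False)
```

With a guard like `getattr(args, "cpf1", False)`, a branch for an unimplemented pipeline would simply be skipped rather than crash the run. Whether that is the right fix for CROPSR is for the maintainer to decide.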

anandksrao commented 2 years ago

Thanks for your reply, Hans.

I did not uncomment any lines during install, but based on your response above, I did go in and manually comment out lines 476 through 502.

This allowed successful execution of my run as outlined in your instructions, and I see correctly formatted output files as well. Thank you!

Your CROPSR GitHub home page mentions:

An option to output as a JSON following MongoDB formatting is also provided, requiring the pymongo library as an additional dependency.

I did install the pymongo library using your instructions: $ pip3 install pymongo

But in neither the CROPSR arguments nor the Help Menu listed under step 4 of First Steps do I see a flag to send output to a JSON file. I apologize if I am failing to notice something obvious. Could you please point me in the right direction? Thanks in advance.

H2muller commented 2 years ago

I apologize, it must have been removed in one of the updates, and I hadn't realized that the instructions no longer cover running the MongoDB conversion. You can manually convert the CSV to a MongoDB or a SQL-type database at your own discretion, whichever works best for your workflow. It is not a requirement for CROPSR functionality.
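
For the manual conversion mentioned above, one possible approach is to read the CSV into one dictionary per row and serialize the rows as newline-delimited JSON. The sketch below uses only the standard library. The column names in the sample are hypothetical, since the real CROPSR output columns are not listed in this thread:

```python
import csv
import io
import json


def csv_to_documents(handle):
    """Read CSV rows into a list of dicts, one per record."""
    return [dict(row) for row in csv.DictReader(handle)]


def to_json_lines(documents):
    """Serialize documents as newline-delimited JSON, one per line."""
    return "\n".join(json.dumps(doc) for doc in documents)


# Hypothetical two-row sample; the real CROPSR columns may differ.
sample = io.StringIO(
    "chromosome,start,sequence\n"
    "Chr01,100,ACGT\n"
    "Chr01,250,TTGA\n"
)
documents = csv_to_documents(sample)
json_lines = to_json_lines(documents)
```

Note that `csv.DictReader` returns every value as a string, so numeric columns would need an explicit conversion before loading. The resulting list of dicts could then be passed to pymongo's `Collection.insert_many`, or the newline-delimited file could be loaded with `mongoimport`.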

anandksrao commented 2 years ago

Gotcha!

BTW, do you think it might be easy to (also) output the sgRNA predictions in GFF3 format?

because...

I imagine it would be easy and very useful for an end user like myself to use something like BEDTOOLS or BEDOPS to extract the intersection between my target gene(s) of interest and the CROPSR predictions, using one or more of the several tools in these packages. Your thoughts?
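
To make the GFF3 idea concrete, here is a rough sketch of how one prediction could be written as a GFF3 feature line. GFF3 uses nine tab-separated columns (seqid, source, type, start, end, score, strand, phase, attributes) with 1-based, inclusive coordinates. The function name, its parameters, and the `guide_RNA` feature type are assumptions for illustration and do not reflect anything CROPSR currently provides:

```python
GFF3_HEADER = "##gff-version 3"


def guide_to_gff3(seqid, start, end, strand, guide_id, source="CROPSR"):
    """Format one guide as a GFF3 line (1-based, inclusive coordinates)."""
    if strand not in {"+", "-"}:
        raise ValueError("strand must be '+' or '-'")
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    columns = [
        seqid,
        source,
        "guide_RNA",        # assumed feature type
        str(start),
        str(end),
        ".",                # score: not set in this sketch
        strand,
        ".",                # phase: only meaningful for CDS features
        f"ID={guide_id}",
    ]
    return "\t".join(columns)


# Hypothetical guide spanning 20 bases on the forward strand of Chr01.
line = guide_to_gff3("Chr01", 101, 120, "+", "g1")
```

A file made of the header plus one such line per guide could then be compared against a gene annotation with something like `bedtools intersect -a genes.gff -b guides.gff3`, where both file names are placeholders.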

Note to self about CSV to MongoDB conversion: check out these links and implement whatever is easy / efficient / makes sense.
https://kb.objectrocket.com/mongo-db/how-to-import-a-csv-into-mongodb-327
https://medium.com/analytics-vidhya/import-csv-file-into-mongodb-9b9b86582f34
https://www.mongodb.com/community/forums/t/how-to-load-a-csv-file-to-mongodb-atlas/106571/2
https://www.mongodb.com/docs/database-tools/mongoimport/