carnegie / PlantClusterFinder

GNU General Public License v3.0

Plant cluster finder crashing #3

Closed amit4mchiba closed 2 years ago

amit4mchiba commented 2 years ago

Hi,

I am writing to seek your help with running the PCF program. I followed all the instructions, got the pgdb, and then ran PCF, but it gives errors and crashes. Please find the log file below. I would be sincerely grateful if you could advise me.

Setting up environment variables

LD_LIBRARY_PATH is .:/mnt/ssd/amit8chiba/software/v91/runtime/glnxa64:/mnt/ssd/amit8chiba/software/v91/bin/glnxa64:/mnt/ssd/amit8chiba/software/v91/sys/os/glnxa64:/mnt/ssd/amit8chiba/software/v91/sys/opengl/lib/glnxa64
Parameter 1: -pgdb Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/pgdb/gkyoto_cyccyc/1.0/data/
Parameter 2: -rmdf Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/Inputs/ReactionMetabolicDomainClassification.txt
Parameter 3: -md Value: {'Amines and Polyamines Metabolism'; 'Amino Acids Metabolism'; 'Carbohydrates Metabolism'; 'Cofactors Metabolism'; 'Detoxification Metabolism'; 'Energy Metabolism'; 'Fatty Acids and Lipids Metabolism'; 'Hormones Metabolism'; 'Inorganic Nutrients Metabolism'; 'Nitrogen-Containing Compounds'; 'Nucleotides Metabolism'; 'Phenylpropanoid Derivatives'; 'Polyketides'; 'Primary-Specialized Interface Metabolism'; 'Redox Metabolism'; 'Specialized Metabolism'; 'Sugar Derivatives'; 'Terpenoids'}
Parameter 4: -psf Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/Glur_r1.01_pep_HC_8chrs.fasta
Parameter 5: -gtpf Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/gtpf_GGlur.txt
Parameter 6: -glof Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/glof_GGlur.gene.gff3.txt
Parameter 7: -dnaf Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/Glur_r1.0_hardmask_gene_predicted_8chr.fa
Parameter 8: -sitf Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/Inputs/scaffold-tailoring-reactions-05082016.tab
Parameter 9: -gout Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/Vaisu_GGlur_ARAGene1_3_09062022_memex.txt
Parameter 10: -cout Value: /mnt/md0/Gene_cluster_analysis/Multi_species_PCF_run/GGlur_kyotoline_30052022/GGlur_PCF_run/Vaisu_GGlur_ARAClust1_3_09062022_memex.txt
Parameter 11: SeqGapSizesChromBreak Value: [10000]
Parameter 12: PGDBIdsToMap Value: GTP
Parameter 13: UnmaskedDNA Value: 1
Parameter 14: MinStepSize Value: 3
Parameter 15: EnzMinForClusters Value: 2
Initializing parameters and running PlantClusterFinder
Check files for reading and writing

Warning: Discarding metabolic domain information of Macromolecules Metabolism
In PlantClusterFinder>f_get_metabolic_domains (line 2340)
In PlantClusterFinder (line 859)

Warning: Discarding metabolic domain information of NA
In PlantClusterFinder>f_get_metabolic_domains (line 2340)
In PlantClusterFinder (line 859)

Warning: Discarding metabolic domain information of NOT IN METACYC
In PlantClusterFinder>f_get_metabolic_domains (line 2340)
In PlantClusterFinder (line 859)

Warning: Discarding metabolic domain information of Other
In PlantClusterFinder>f_get_metabolic_domains (line 2340)
In PlantClusterFinder (line 859)

Warning: Discarding metabolic domain information of Unclassified
In PlantClusterFinder>f_get_metabolic_domains (line 2340)
In PlantClusterFinder (line 859)

Error using PlantClusterFinder>f_analyze_PlantClusterGapFile
Too many input arguments.

Error in PlantClusterFinder>f_get_Sequencing_Gaps (line 2168)

Error in PlantClusterFinder>f_annotate_Sequencing_Gaps (line 1984)

Error in PlantClusterFinder (line 1124)

MATLAB:TooManyInputs

Thank you so much.

Regards,
Amit

amit4mchiba commented 2 years ago

Dear Pascal,

Thank you so much for all your support. With your help I was able to overcome the issue, and I am describing here how I resolved the errors.

First, it is probably important to include the path to PCF (the folder that contains all the scripts, including the compiled PCF script). This helped. In many cases, the issue was related to read/write/execute permissions. Before running the program, I executed the following commands, and it worked:

find ../PlantClusterFinder -type d -exec chmod a+rwx {} \;
find ../PlantClusterFinder -type f -exec chmod a+rwx {} \;

This step gives the folders and files rwxrwxrwx permissions.
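For readers who would rather not make everything world-writable, a narrower variant is sketched below. It is only an illustration and assumes that the user who runs PCF also owns the folder; `fix_pcf_permissions` is a hypothetical helper name, not part of PCF.

```shell
#!/bin/sh
# Hypothetical helper (not part of PCF): give the owning user access to a
# PCF folder without opening it to every account on the machine.
fix_pcf_permissions() {
    pcf_dir=$1
    # u+rwX: read/write for the owner; execute only on directories and on
    # files that already carry an execute bit.
    chmod -R u+rwX "$pcf_dir"
    # Shell scripts need the execute bit to be launched directly.
    find "$pcf_dir" -type f -name '*.sh' -exec chmod u+x {} \;
}
```

It would be called as `fix_pcf_permissions ../PlantClusterFinder`. If the compiled PCF binary has lost its execute bit, it would still need an explicit `chmod u+x`, since the capital `X` does not add execute permission to regular files that have none.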

Next, I ran the following:

/home/amit8chiba/miniconda2/bin/awk \
    -f enter_new_line_characters_in_fasta_file.awk \
    '../assembly_gene_predicted_8chr_wrapped.fa' > '../assembly_gene_predicted_8chr_wrapped.fa_temp1'

tr -d '\r' < '../assembly_gene_predicted_8chr_wrapped.fa_temp1' > '../assembly_gene_predicted_8chr_wrapped.fa_temp2'

tr -d '\n' < '../assembly_gene_predicted_8chr_wrapped.fa_temp2' > '../assembly_gene_predicted_8chr_wrapped.fa_temp3'

/home/amit8chiba/miniconda2/bin/awk \
    -f get_new_line_in.awk '../assembly_gene_predicted_8chr_wrapped.fa_temp3' > '../assembly_gene_predicted_8chr_wrapped.fa_temp4'

/home/amit8chiba/miniconda2/bin/awk \
    -f get_positions_of_gap.awk '../assembly_gene_predicted_8chr_wrapped.fa_temp4' > '../assembly_gene_predicted_8chr_wrapped.fa_GAPOutput'
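The five preprocessing commands above could be wrapped in one function so they are easy to repeat for another genome. This is a sketch only: `make_gap_file`, its arguments, and the `AWK` variable are hypothetical names, and it assumes the three `.awk` scripts sit together in one directory.

```shell
#!/bin/sh
# Hypothetical wrapper around the preprocessing steps shown above.
# $1: genome FASTA file; $2: directory with the .awk scripts (default: .)
make_gap_file() {
    fasta=$1
    awk_dir=${2:-.}
    awk_bin=${AWK:-awk}   # set AWK to use a specific awk binary

    # Each step runs only if the previous one succeeded.
    "$awk_bin" -f "$awk_dir/enter_new_line_characters_in_fasta_file.awk" "$fasta" > "${fasta}_temp1" &&
    tr -d '\r' < "${fasta}_temp1" > "${fasta}_temp2" &&
    tr -d '\n' < "${fasta}_temp2" > "${fasta}_temp3" &&
    "$awk_bin" -f "$awk_dir/get_new_line_in.awk" "${fasta}_temp3" > "${fasta}_temp4" &&
    "$awk_bin" -f "$awk_dir/get_positions_of_gap.awk" "${fasta}_temp4" > "${fasta}_GAPOutput"
}
```

For the genome above this would be `AWK=/home/amit8chiba/miniconda2/bin/awk make_gap_file ../assembly_gene_predicted_8chr_wrapped.fa .`, which writes the same `_temp1` to `_temp4` and `_GAPOutput` files next to the FASTA file.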

After these steps, I just followed the instructions on GitHub and got all the expected results. These are the PCF commands I used on the test data:

find ../PlantClusterFinder -type d -exec chmod a+rwx {} \;
find ../PlantClusterFinder -type f -exec chmod a+rwx {} \;

/home/amit8chiba/miniconda2/bin/awk \
    -f enter_new_line_characters_in_fasta_file.awk \
    './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa' > './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_temp1'

tr -d '\r' < './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_temp1' > './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_temp2'

tr -d '\n' < './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_temp2' > './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_temp3'

/home/amit8chiba/miniconda2/bin/awk \
    -f get_new_line_in.awk './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_temp3' > './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_temp4'

/home/amit8chiba/miniconda2/bin/awk \
    -f get_positions_of_gap.awk './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_temp4' > './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_GAPOutput'

./run_PlantClusterFinder.sh /mnt/ssd/amit8chiba/software/v91/ \
    -pgdb './csubellipsoidea/pgdb/csubellipsoideacyc/1.0/data/' \
    -rmdf './Inputs/ReactionMetabolicDomainClassification.txt' \
    -md "{'Amines and Polyamines Metabolism'; 'Amino Acids Metabolism'; 'Carbohydrates Metabolism'; 'Cofactors Metabolism'; 'Detoxification Metabolism'; 'Energy Metabolism'; 'Fatty Acids and Lipids Metabolism'; 'Hormones Metabolism'; 'Inorganic Nutrients Metabolism'; 'Nitrogen-Containing Compounds'; 'Nucleotides Metabolism'; 'Phenylpropanoid Derivatives'; 'Polyketides'; 'Primary-Specialized Interface Metabolism'; 'Redox Metabolism'; 'Specialized Metabolism'; 'Sugar Derivatives'; 'Terpenoids'}" \
    -psf './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.protein.pcf13.fa' \
    -gtpf './csubellipsoidea/gtpf_CsubellipsoideaC_169_227_v2.0.annotation_info.txt.txt' \
    -glof './csubellipsoidea/glof_CsubellipsoideaC_169_227_v2.0.gene.gff3.txt' \
    -dnaf './csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa' \
    -sitf './Inputs/scaffold-tailoring-reactions-05082016.tab' \
    -gout './csubellipsoidea/Vaisu_ARAGene1_3_09062022_memex.txt' \
    -cout './csubellipsoidea/Vaisu_ARAClust1_3_09062022_memex.txt' \
    SeqGapSizesChromBreak '[15000]' PGDBIdsToMap GTP
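Since several of my problems came down to files that could not be read, a small check before launching PCF may save a failed run. The sketch below is an illustration only; `check_pcf_inputs` is a hypothetical helper, and which files to pass to it is up to the user.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of PCF): report every input file
# that is missing or unreadable, and return non-zero if any was found.
check_pcf_inputs() {
    status=0
    for path in "$@"; do
        if [ ! -r "$path" ]; then
            echo "missing or unreadable: $path" >&2
            status=1
        fi
    done
    return "$status"
}
```

For the test data it could be called with the files given to `-rmdf`, `-psf`, `-gtpf`, `-glof`, `-dnaf` and `-sitf`, plus the `_GAPOutput` file produced in the previous step, for example `check_pcf_inputs ./Inputs/ReactionMetabolicDomainClassification.txt ./csubellipsoidea/CsubellipsoideaC_169_227_v2.0.hardmasked.fa_GAPOutput && echo "inputs OK"`.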

Again, many, many thanks, Pascal, for your support.

Regards,
Amit