Arkadiy-Garber / FeGenie

HMM-based identification and categorization of iron genes and iron gene operons in genomes and metagenomes
GNU Affero General Public License v3.0

Error - ValueError: could not convert string to float: 'EMPTY' #32

Open JBuongio opened 2 years ago

JBuongio commented 2 years ago

Hi there, I am aiming to make plots with coverage data for several BAM files. I get .depth files and .csv files (as well as ORF_calls and HMM_results), but there are no plots. Here is my input and the error:


FeGenie.py -bin_dir /home/lloydlab/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fasta_files/Indiv_Anvio_fastas -bin_ext fasta -out /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output/Plots -t 16 -bams Bam_file_for_fegenie.txt --makeplots --norm

Thread 0 finished: VKAB123_transcripts_MAGs_no_ribo.bam with 15432 reads and 8719 readsWellMapped
Creating depth matrix file: /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output/Plots/Woeseia_stnF.depth
Closing most bam files
Closing last bam file
Finished processing... Woeseia_stnF
Traceback (most recent call last):
  File "/home/lloydlab/anaconda3/envs/fegenie/bin/FeGenie.py", line 3020, in <module>
    main()
  File "/home/lloydlab/anaconda3/envs/fegenie/bin/FeGenie.py", line 2586, in main
    Dict[cell][process].append(float(depthDict[cell][contig]))
ValueError: could not convert string to float: 'EMPTY'


Thank you!

Arkadiy-Garber commented 2 years ago

Hi Joy,

Sorry about this issue. I think I have an idea of what is causing it though. Would you be able to share the final output .csv files that were generated, along with the .depth file? That would help me narrow this down.

Thanks, Arkadiy

JBuongio commented 2 years ago

Hi Arkadiy, Thanks for getting back to me. I have many .depth files! Here's a google folder with the .depth and .csv files. Please request access and I'll grant it :)

Best, Joy

Arkadiy-Garber commented 2 years ago

Hi Joy,

Thanks for sharing that. That's helpful. Could you please also provide this file: "Bam_file_for_fegenie.txt"?

Thanks, Arkadiy

Arkadiy-Garber commented 2 years ago

Actually, is it fair to assume that in your provided "Bam_file_for_fegenie.txt", the first column of each row contains the genome/bin name without the '.fasta' extension? If so, could you please try re-running with the .fasta extension included in the bin names? FeGenie expects the first column of that file to contain bin names exactly as they appear in the second column of the "FeGenie-geneSummary.csv" file. Does that make sense?
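For illustration only (I believe the file is tab- or whitespace-delimited; "sample2.bam" is just a placeholder, and the # lines are annotation rather than part of the file), a -bams file that matches this expectation would look something like:

# Bam_file_for_fegenie.txt: column 1 = bin name exactly as it appears in
# FeGenie-geneSummary.csv (with the .fasta extension); remaining columns = BAM files
Woeseia_stnF.fasta    VKAB123_transcripts_MAGs_no_ribo.bam    sample2.bam
Acidobacteria_stnAC.fasta    VKAB123_transcripts_MAGs_no_ribo.bam    sample2.bam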

JBuongio commented 2 years ago

Bam_file_for_fegenie.txt

Your suggestion worked! I got a .tiff dotplot after amending the Bam_file_for_fegenie.txt file to include the fasta extensions. Would there be more than one dotplot, though? One for each sample (metatranscriptome)? Thank you!

Arkadiy-Garber commented 2 years ago

Hey Joy,

Great! Glad to hear that including the .fasta extensions worked. I will edit the wiki/readme to make this input more clear for users.

Currently, FeGenie creates a single dotplot that is based on the total average coverage (third column in the generated depth files). However, I just added a flag to the program, -which_bams, which lets you specify which BAM file from your "Bam_file_for_fegenie.txt" file the dotplot should be based on. Here is how this works: if you want the dotplot to correspond to the first BAM file listed in that file, set -which_bams 1; for the third one, set -which_bams 3. You'd have to re-run the program multiple times to generate a dotplot for each BAM file, but if you include the --skip flag, FeGenie will skip the main part of the pipeline and proceed straight to the heatmap/dotplot generation. Does that make sense? Let me know if you have any questions.
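For example, under the behavior described above (the long directory paths from earlier in this thread are shortened to /path/to/... placeholders here), two re-runs against the same output directory might look like:

FeGenie.py -bin_dir /path/to/Indiv_Anvio_fastas -bin_ext fasta -out /path/to/Output -t 16 -bams Bam_file_for_fegenie.txt --makeplots --norm --skip -which_bams 1
FeGenie.py -bin_dir /path/to/Indiv_Anvio_fastas -bin_ext fasta -out /path/to/Output -t 16 -bams Bam_file_for_fegenie.txt --makeplots --norm --skip -which_bams 3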

Thanks, Arkadiy

JBuongio commented 2 years ago

Thanks for this! I reinstalled FeGenie with conda and ran the following: FeGenie.py -bin_dir /home/lloydlab/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fasta_files/Indiv_Anvio_fastas -bin_ext fasta -out /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_1 -t 16 -bams Bam_file_for_fegenie.txt --which_bams 1 --skip --makeplots --norm

Even though I used "--skip", it was still finding ORFs... anyway, it errored out:

Finding ORFs for Desulfosarcina_stnAB.fasta
Traceback (most recent call last):
  File "/home/lloydlab/anaconda3/envs/fegenie/bin/FeGenie.py", line 3020, in <module>
    main()
  File "/home/lloydlab/anaconda3/envs/fegenie/bin/FeGenie.py", line 2744, in main
    if args.hbm:
AttributeError: 'Namespace' object has no attribute 'hbm'

Am I supposed to run this in the directory with the .depth files?

Arkadiy-Garber commented 2 years ago

Ah, shoot - stupid bug. Just fixed it and pushed the changes to the GitHub repo.

It shouldn't be finding ORFs if you provide the same output directory as you did in your initial run. Are you by chance providing a new directory name to correspond to the specific -which_bams argument? If so, then --skip will not work. I just tweaked the script again, so that you can now re-run as many times as you want with different -which_bams arguments, and the heatmap/dotplots that will be generated will have your argument in the filename (so it will not overwrite the previously produced heatmap/dotplot outputs). Does that make sense? You should actually be able to put FeGenie into a for-loop, like so:

for i in {1..10}; do
FeGenie.py -bin_dir /home/lloydlab/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fasta_files/Indiv_Anvio_fastas -bin_ext fasta -out /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/original_output_dir -t 16 -bams Bam_file_for_fegenie.txt --which_bams $i --skip --makeplots --norm
done

Also, I just remembered that I need to specify a new release of FeGenie before the bioconda bot will push the new changes to the FeGenie recipe in bioconda, and that actually takes a few days. In other words, the changes that I made to FeGenie today will not show up on your end if you just redo the conda install. You'd have to download the code from GitHub via:

git clone https://github.com/Arkadiy-Garber/FeGenie.git
cd FeGenie
bash setup-conda-env.sh
conda activate fegenie

I just added the setup-conda-env.sh file to the repository, and it should create a FeGenie-compatible conda environment for you, using the updated code that is in this repository right now. Does this make sense? Sorry if this is all confusing... I'm still learning about all this conda and GitHub stuff myself.

Thanks, Arkadiy

JBuongio commented 2 years ago

Hi Arkadiy, Thank you so much for your hard work on this and quick replies. I was indeed referring to a different directory in the last run, so this makes sense.

A for loop was exactly my next step after an initial run to make sure I understood that it worked! Thank you for providing it ;)

I will do a reinstall with the instructions you provided, thank you! I'll let you know how it goes.

JBuongio commented 2 years ago

Hi again,

It seems that FeGenie.py is no longer in my PATH, so I can't call it without providing the absolute path to the script. I'm trying a run on everything at once again (without the --skip and without the --which_bams arguments for now) and got this:

(fegenie) lloydlab@barbara:~/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie$ /home/lloydlab/FeGenie/FeGenie.py -bin_dir /home/lloydlab/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fasta_files/Indiv_Anvio_fastas -bin_ext fasta -out /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone -t 16 -bams Bam_file_for_fegenie.txt --makeplots --norm
Traceback (most recent call last):
  File "/home/lloydlab/FeGenie/FeGenie.py", line 519, in main
    test = open(bits)
FileNotFoundError: [Errno 2] No such file or directory: '/HMM-bitcutoffs.txt'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/lloydlab/FeGenie/FeGenie.py", line 3135, in <module>
    main()
  File "/home/lloydlab/FeGenie/FeGenie.py", line 527, in main
    location = allButTheLast(location, "/")
UnboundLocalError: local variable 'location' referenced before assignment

And then just a test run with test_run.sh:

(fegenie) lloydlab@barbara:~/FeGenie$ bash test_run.sh
test_run.sh: line 1: FeGenie.py: command not found

I promise I ran setup-conda-env.sh!! :)

Arkadiy-Garber commented 2 years ago

oh shoot, sorry for the delay in getting back to this. Let me see what might be going on.

Arkadiy-Garber commented 2 years ago

So there were a couple of issues with the setup-conda-env.sh file, which I have now fixed. But it seems that, in order for this to work, you need to manually add FeGenie to your PATH in the .bash_profile file, which should be in your home directory (alternatively, this file may be called .profile). Could you please delete the FeGenie repository that you downloaded previously and try again with the bash setup-conda-env.sh command? After this, please add the FeGenie directory to your PATH in your .bash_profile, for example: PATH=$PATH:$HOME/bin/FeGenie. Does this make sense? Let me know if you have any questions or issues with this. Sorry, I don't mean to be treating you like a guinea pig!
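As a concrete sketch (this assumes the repo was cloned into your home directory; adjust the path to wherever FeGenie actually lives on your system):

# append the FeGenie repo directory to PATH in ~/.bash_profile (or ~/.profile)
echo 'export PATH=$PATH:$HOME/FeGenie' >> ~/.bash_profile
# reload the profile so the current shell picks up the change
source ~/.bash_profile
# FeGenie.py should now resolve without an absolute path
which FeGenie.py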

JBuongio commented 2 years ago

Hi Arkadiy, Thanks for getting back to me. I finally had time to try the solution you provided. The test run finished without crashing! My attempt to run all of my samples at once has also been successful so far. Once this is completed, I will use the for-loop you provided. Best, Joy

JBuongio commented 2 years ago

Hmm, spoke too soon. Command:

(fegenie) lloydlab@barbara:~/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie$ FeGenie.py -bin_dir /home/lloydlab/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fasta_files/Indiv_Anvio_fastas -bin_ext fasta -out /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone -t 16 -bams Bam_file_for_fegenie.txt --makeplots --norm

Error:

Looking for Thermincola S-layer cytochromes and Geobacter-related porin-cytochrome operons
Pre-processing of final outout file
Counting heme-binding motifs
Final processing of output
iron_oxidation,Acidobacteria_stnAC.fasta,c_000000000041_24,Cyc2_repCluster3,88.4,27.3,558,1,MSRTIGSLSRCSMAGAALFSLAALAILCATPANAIPAFARKYETSCQTCHVAYPKLNTFGQAFRLLGYRMPGETEGQVKRPDVALGAASYKRVWPDAVWPGAIPQNLPLSLVANFQVQNSSQIEIEDGEVHRDTVNNDMIFPSEVALVVAGTAGEHVSYFGEIGFEQSVEAGMIEQEVGVEHIDIRFIRPIRNSMAFNVKIGSFQPELVSGFDHARRLTVANYDSMFGVSPIQSGGTEIVGGGGHHGGGGGISLPAVGRGIDLYGVVSHRFTWAAGILNGIGPGDATFDANSGKDTYVKGAYKWGGLAPDGSNAAVYAGSPKNWREKSVQVGVFYYRGDGKDIFFREEHDELGVIEVIFVEDPDYTRTGLDFNWYFKDLNIFGAYVSGDDDLRIYADSVTEPGEPGVFDPDESGTYTYKSWFVEADMVLGMPWLHGAVRYETVDLPRAEDGLKVQAYERATLSMTALVRANVKGVMEYTEDLNESRNYQFWLGAGIAF
iron_oxidation,Woeseia_stnF.fasta,c_000000000167_9,Cyc2_repCluster3,56.8,27.3,273,1,MCNQVRTITVAVWVLAATCFWHEPVDAMPAFARQYSVSCNVCHAAYPRLNDFGETFAGDMNFRLPNWRDNTVQAGDETLALPKSLPLALRLQAFVQGRDGETIDPVSGEIAADSSFDFQSPYLLKLLSSAPLTDHISYYVYAIFAEKGGNTEVIVEDAWFSHDDLFNTGVGAQLGQFQVSDLMFPREVRMTFQDFMAYRMAGITYDRGILFGKGLGPVDMSLGFVNGNGIEQNFKINSPGYKRPDRMFDNDTQKSVFGRLGTDFGPVSVGLFGLSGSQQNATGPAGLDTGQRRTDKIVAGIDLSGNIDGNWYWFGQYLWNSWDGFLDPAIDYEWSGGFVGVDYIRSEKWVFSALVNHTDAGDLKNTDTIYEGIDMSTVTFTSSYYFMRNVKGQIEVSLDLLDEEPQTGLYFTGHLSKENYILIGIDAAF
Writen summary to file: /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone/FeGenie-geneSummary-clusters.csv for visual inspection
Writen summary to file: /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone/FeGenie-geneSummary.csv for downstream parsing and analyses
Writing heatmap-formatted output file: /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone/FeGenie-heatmap-data.csv

processing... Acidimicrobiia_stnAC
sh: 1: jgi_summarize_bam_contig_depths: not found
processing... Acidimicrobiia_stnAC
Traceback (most recent call last):
  File "/home/lloydlab/FeGenie/FeGenie.py", line 2632, in main
    depth = open("%s/contigDepths/%s.depth" % (args.out, cell))
FileNotFoundError: [Errno 2] No such file or directory: '/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone/contigDepths/Acidimicrobiia_stnAC.depth'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/lloydlab/FeGenie/FeGenie.py", line 3135, in <module>
    main()
  File "/home/lloydlab/FeGenie/FeGenie.py", line 2648, in main
    depth = open("%s/%s.depth" % (outDirectory, cell))
FileNotFoundError: [Errno 2] No such file or directory: '/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone/Acidimicrobiia_stnAC.depth'

Arkadiy-Garber commented 2 years ago

Crud, sorry about that. Looks like I left jgi_summarize_bam_contig_depths (from the MetaBAT2 conda package) out of the setup-conda-env.sh script. I just added (what should be) all the required dependencies. Could you please try again with bash setup-conda-env.sh? Before you do that, please remove the previous fegenie conda environment with conda env remove -n fegenie.
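If the rebuilt environment still lacks that tool, a quick manual check and fix (assuming the bioconda channel is configured on your end) would be something like:

# verify the MetaBAT2 utility is on the PATH inside the fegenie environment
conda activate fegenie
command -v jgi_summarize_bam_contig_depths || conda install -c bioconda metabat2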

Let me know how it goes.

Thanks, Arkadiy

JBuongio commented 2 years ago

Hi again, I believe it got farther this time. Same command as last run. Error this time:

Creating depth matrix file: /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone/Woeseia2_stnF.depth
Closing most bam files
Closing last bam file
Finished processing... Woeseia2_stnF
Traceback (most recent call last):
  File "/home/lloydlab/FeGenie/FeGenie.py", line 3135, in <module>
    main()
  File "/home/lloydlab/FeGenie/FeGenie.py", line 2669, in main
    Dict[cell][process].append(float(depthDict[cell][contig]))
ValueError: could not convert string to float: 'EMPTY'

Arkadiy-Garber commented 2 years ago

Shoot...but we're making progress down the script! Baby steps, ha.

This error actually reminds me of the first one that you had, at the top of this issue thread. Any chance you're providing a Bam_file_for_fegenie.txt file with .fasta extensions missing from the first column? If not, then something else must be going on, and if you can share one of the .depth output files, along with the .csv outputs, that would be helpful.
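Just in case it helps, here's a quick illustrative check (this assumes the -bams file is tab- or whitespace-delimited) that counts first-column entries missing the .fasta extension:

# prints the number of rows in Bam_file_for_fegenie.txt whose first column does not end in .fasta
awk '{print $1}' Bam_file_for_fegenie.txt | grep -vc '\.fasta$'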

Thanks again, and sorry that FeGenie keeps crashing on you! Arkadiy

JBuongio commented 2 years ago

No need to apologize! I appreciate you working with me on this. Indeed, ".fasta" was missing from the first column. I am re-running now and will update.

Arkadiy-Garber commented 2 years ago

sounds good. Let me know how it goes.

JBuongio commented 2 years ago

Hi Arkadiy, So sorry it took so long. Just got back from a conference. Here are the files: https://drive.google.com/drive/folders/1TFxcrowQXBL7iRofMr-oCnNH4vwMq9NM?usp=sharing

Latest error:

Creating depth matrix file: /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone/Woeseia2_stnF.fasta.depth
Closing most bam files
Closing last bam file
Finished processing... Woeseia2_stnF.fasta
Traceback (most recent call last):
  File "/home/lloydlab/FeGenie/FeGenie.py", line 3135, in <module>
    main()
  File "/home/lloydlab/FeGenie/FeGenie.py", line 2671, in main
    outHeat = open("%s/FeGenie-%s-heatmap-data.csv" % (args.which_bams, outDirectory), "w")
FileNotFoundError: [Errno 2] No such file or directory: 'average/FeGenie-/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone-heatmap-data.csv'

From command: FeGenie.py -bin_dir /home/lloydlab/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fasta_files/Indiv_Anvio_fastas -bin_ext fasta -out /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone -t 16 -bams Bam_file_for_fegenie.txt --makeplots --norm

Arkadiy-Garber commented 2 years ago

No worries, and thanks for sending the files. I can actually see the cause of this error from the command and error message. It is a small bug related to the last update, where I added the -which_bams flag. I just fixed it and updated FeGenie. We're getting close to the end here! Please try again and let me know if there are any other issues.

Thanks! Arkadiy

JBuongio commented 2 years ago

So close!!

Latest, after removing the previous conda environment, pulling down the new repo, doing the conda setup, and running a successful test:

(fegenie) lloydlab@barbara:~/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie$ FeGenie.py -bin_dir /home/lloydlab/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fasta_files/Indiv_Anvio_fastas -bin_ext fasta -out /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone -t 16 -bams Bam_file_for_fegenie.txt --makeplots --norm
sh: 1: echo: echo: I/O error
sh: 1: echo: echo: I/O error
Traceback (most recent call last):
  File "/home/lloydlab/FeGenie/FeGenie.py", line 3135, in <module>
    main()
  File "/home/lloydlab/FeGenie/FeGenie.py", line 512, in main
    bits = HMMdir + "/" + "HMM-bitcutoffs.txt"
UnboundLocalError: local variable 'HMMdir' referenced before assignment

Arkadiy-Garber commented 2 years ago

Hey Joy,

Sorry for the delay! I am looking into this now. Thanks for your patience.

Arkadiy

Arkadiy-Garber commented 2 years ago

Hey! I fixed the bug that was causing this issue. It happens to be higher up in the code than the previous errors, and I am not sure why that is, or why this bug stayed silent during your previous runs. In any case, if you start from a fresh download of the script, there shouldn't be any other issues. But let me know if there are! Thanks again for your patience!

Thank you! Arkadiy

JBuongio commented 2 years ago

Hi Arkadiy, Thanks so much for your continuous help. It ran all the way through to the R plots, where it errored. Your annotations say not to worry. And, indeed, the .depth and .csv files look great! I should be able to figure out how to plot the outputs in R on my own. =)

Creating depth matrix file: /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone/Woeseia2_stnF.fasta.depth
Closing most bam files
Closing last bam file
Finished processing... Woeseia2_stnF.fasta
Running Rscript to generate plots. Do not be alarmed if you see Warning or Error messages from Rscript. This will not affect any of the output data that was already created. If you see plots generated, great! If not, you can plot the data as you wish on your own, or start an issue on FeGenie's GitHub repository
Error in library("ggpubr", lib.loc = library.path) : there is no package called ‘ggpubr’
Execution halted
Registered S3 methods overwritten by 'ggplot2':
  method         from
  [.quosures     rlang
  c.quosures     rlang
  print.quosures rlang
Error: package or namespace load failed for ‘reshape2’ in dyn.load(file, DLLpath = DLLpath, ...):
  unable to load shared object '/home/lloydlab/anaconda3/envs/fegenie/lib/R/library/stringi/libs/stringi.so':
  libicui18n.so.58: cannot open shared object file: No such file or directory
Execution halted
Traceback (most recent call last):
  File "/home/lloydlab/FeGenie/FeGenie.py", line 3141, in <module>
    main()
  File "/home/lloydlab/FeGenie/FeGenie.py", line 2771, in main
    os.system("mv %s/Fegenie-dotplot.tiff %s/Fegenie-%-dotplot.tiff" % (outDirectory, outDirectory, args.which_bams))
TypeError: %d format: a number is required, not str

Arkadiy-Garber commented 2 years ago

Thanks, Joy! Glad that you got the main output files. I just fixed the final little bug on line 2771, and added the R package ggpubr to the list of packages installed by the setup-conda-env.sh script. So if you re-do the git clone and conda installation, you should avoid these errors.

Thanks again for using FeGenie and for your patience as we work through these bugs :)

Cheers, Arkadiy

JBuongio commented 2 years ago

Hi Arkadiy, I did a reinstall (using your easy-install instructions, because for some reason bash setup-conda-env.sh hung at "solving environment"). Anyway, the .tiff plot was made for all of the MAGs! Now I just need to create a graph for each of the samples (which I can do with the for-loop you provided). Thank you so much for sticking with this! I owe you a coffee. Do you have a Venmo or some other way for me to send you a small token of my appreciation?

Thanks! Joy

Arkadiy-Garber commented 2 years ago

Thanks, Joy! That's so nice of you. But no token necessary. I am happy to help :) Perhaps if we're ever at a conference together, we can get together and talk some science over coffee or something.

Glad to hear that the latest install worked. Looking forward to seeing what comes from this analysis! Don't hesitate to reach out if you have any other questions.

Cheers, Arkadiy

JBuongio commented 2 years ago

Hi again! You thought you were rid of me ;) I tried the for-loop and it did not work: it ran for a long while, but then errored out. So I tried to do a single BAM, specifying the -which_bams argument (I tried both a single dash and a double dash before "which_bams"):

(fegenie) lloydlab@barbara:~/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie$ FeGenie.py -bin_dir /home/lloydlab/klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fasta_files/Indiv_Anvio_fastas -bin_ext fasta -out /klds1515/SVA08.16_Metagenomes/071817KLmetagenome-45705703/Bins_from_Derep_Dastool/Fegenie/Output_redone_withgraphs_indivBAMs -t 16 -bams Bam_file_for_fegenie.txt -which_bams 1 --makeplots --norm --skip --nohup

It provided this error:

Woeseia_stnF.fasta 0 X.1 NA
Error in hclust(d = dist(x = fegenie.scaled)) : NA/NaN/Inf in foreign function call (arg 10)
Calls: as.dendrogram -> hclust
Execution halted
Traceback (most recent call last):
  File "/home/lloydlab/anaconda3/envs/fegenie/bin/FeGenie.py", line 3141, in <module>
    main()
  File "/home/lloydlab/anaconda3/envs/fegenie/bin/FeGenie.py", line 3092, in main
    os.system("mv %s/Fegenie-dotplot.tiff %s/Fegenie-%-dotplot.tiff" % (outDirectory, outDirectory, args.which_bams))
TypeError: %d format: a real number is required, not str

Thanks for your help! Best, Joy

Arkadiy-Garber commented 2 years ago

Hi Joy! Sorry for the delayed response, especially since the fix for this was literally a single letter that was missing from that line in the script - but those are sometimes the hardest bugs to catch.

I also fixed up a few other bugs to deal with the "Execution halted" errors from R, so the whole thing should be ready for the for-loop again. Let me know how it goes!

Cheers, Arkadiy

JBuongio commented 2 years ago

Hi Arkadiy, Thanks so much! I ran it yesterday and the for-loop finished without error, but I only have a few .tiff files (Fegenie-average-dotplot.tiff, Fe-dendro.tiff, and Fe-heatmap.tiff) instead of one for each argument variable used for -which_bams. But, it's all good! I'm going to try running one at a time (-which_bams 1; -which_bams 2; etc.). I'll let you know how it goes. Thanks again!!! Best, Joy

Arkadiy-Garber commented 2 years ago

Hi Joy,

I just noticed a mistake in the for-loop that I sent you. Instead of --which_bams $i, it should be -which_bams $i. So only a single dash, and that should create multiple plots, one for each BAM file column in your -bams input file. Let me know if altering that fixes the issue.
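So the loop from earlier in the thread, corrected (with the long directory paths replaced by /path/to/... placeholders here), would look like:

for i in {1..10}; do
  FeGenie.py -bin_dir /path/to/Indiv_Anvio_fastas -bin_ext fasta -out /path/to/original_output_dir -t 16 -bams Bam_file_for_fegenie.txt -which_bams $i --skip --makeplots --norm
done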

Thanks! Arkadiy

JBuongio commented 2 years ago

Hi Arkadiy, Your fix worked! I ran it last night and woke up to separate plots, all lovely and ready for me to interpret! Thank you SO MUCH. =D Best, Joy

Arkadiy-Garber commented 2 years ago

Awesome, glad that it's working for you now! Have fun diving into the data. Looking forward to reading all about this study once it's published :)

Let me know if you have other questions or issues.

Cheers, Arkadiy

DrRumble commented 2 years ago

Hi Arkadiy, I opened a similar issue months ago but never got back to you, because I saw that part of the problem was with my assembly and BAM files, and that took me a bit to figure out (as well as finals, conferences, etc., haha). I couldn't find my original post, so I hope it's okay to add it here. Sorry for leaving you hanging like that! I am having a similar issue as above; I have attached my code and output, but I am not sure if this is a bug or something on my end. Any help would be heaven-sent!

FeGenie query output.txt

Thank you!!!

Arkadiy-Garber commented 2 years ago

Hi there,

Thanks again for your interest in FeGenie! I'd be happy to help out with this. Could you please send me the command that you're using and the file you are providing via the -bams flag?

Thanks, Arkadiy

DrRumble commented 2 years ago

Thank you! Sorry for my delayed response; I had a conference this week and I am in the midst of moving into my first house.

I am not using a -bams file at the moment because it was just a test run with one file, so I used the -bam flag. But this is the command:

FeGenie.py -bin_dir ~/bioinformatics/metagenomic_files/RQCF_output_3_unmerged/BFC_corrected/out.megahit.A.samples/ -bin_ext fa -out bam_fegenie_out --makeplots -t 6 -bam ~/bioinformatics/metatranscriptomics/raw_fastq_files/unmerged_rqcfiles/WD-1_S1.anqrpt.fastq.pairedMapped_sorted.bam --meta

I tried to attach the bam file too but it wouldn't let me load it here. I can email it to you if needed!

Thank you!

Arkadiy-Garber commented 2 years ago

Hi,

Thanks for that information, and congrats on the new house!

Based on this information, I would guess that the contig names in the FASTA files inside the "megahit.A.samples" directory differ from the contig names that were used to generate the BAM file. Can you please double-check the FASTA file used to generate the WD-1_S1.anqrpt.fastq.pairedMapped_sorted.bam file, and compare its headers to the header names in the megahit.A.samples FASTA files?

Also, if the pipe character '|' appears in the header names, that may also cause problems, since FeGenie uses that character as a delimiter in some parts of the code.
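A quick way to compare the two sets of names and spot any '|' characters (this assumes samtools is installed; the paths here are abbreviated placeholders):

# contig headers in the bin FASTA files FeGenie is scanning
grep '^>' /path/to/out.megahit.A.samples/*.fa | head
# reference contig names recorded in the BAM header
samtools view -H WD-1_S1.anqrpt.fastq.pairedMapped_sorted.bam | grep '^@SQ' | head
# count FASTA headers that contain the pipe character
grep '^>' /path/to/out.megahit.A.samples/*.fa | grep -c '|'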

Let me know if you find any differences! Otherwise, could you try running the program without the -bam argument and see if it finishes without error?

Thanks, Arkadiy