leffj / mctoolsr

Microbial community analysis tools in R
http://leffj.github.io/mctoolsr/

The condition has length > 1 and only the first element will be used (load_taxa_table) #34

Closed jarrodscott closed 3 years ago

jarrodscott commented 3 years ago

Hi, I am running the function load_taxa_table using tab delimited taxa and metadata tables and I get the following:

the condition has length > 1 and only the first element will be used

My tables are formatted (to the best of my knowledge) just like the example files provided with the mctoolsr package. When I run the command on those example files, like so:

input <- load_taxa_table("fruits_veggies_taxa_table_wTax.txt", "fruits_veggies_metadata.txt")

I do not get this error, so something must be wrong with my input files. I cannot for the life of me figure out what the problem is.

Any ideas? Thanks

hhollandmoritz commented 3 years ago

Hi Jarrod,

Often this issue comes up if there isn't a "#" symbol in front of the first line of the taxa table. This quirk is a legacy of the way that Qiime OTU tables were formatted when the mctoolsr package was first written.

Hope this helps! Hannah

jarrodscott commented 3 years ago

Hi Hannah @hhollandmoritz

Thanks for the fast reply :)

Both of my files have a # at the beginning of the first line. I attached them just in case.

Best, Jarrod

Attachments: tmp_otu_tax.txt, tmp_md.txt

hhollandmoritz commented 3 years ago

Hi Jarrod (@jarrodscott),

Thanks for attaching those files; they were very useful! I was able to load the samples on my own computer without any issue, so the problem seems to be specific to either your code or your installation and setup. Hopefully it is your code, as that would be a much easier problem to solve. This is the exact code I used to load your samples:

library(mctoolsr)

tax_table_fp = "~/Downloads/tmp_otu_tax.txt"
map_fp = "~/Downloads/tmp_md.txt"
input = load_taxa_table(tax_table_fp, map_fp)

Let me know if you still have issues after using that code and we can try to troubleshoot what might be going on with your setup.

Cheers, Hannah

(As a post-script - there have been a bunch of installation-related issues recently with mctoolsr, so I wouldn't be at all surprised if it's not a code issue at all and there is something wrong with your installation).

jarrodscott commented 3 years ago

Hi Hannah,

Thank you, thank you! It is great that the files worked for you---valuable information. I tried your code and got the same message. I also tried running the code in R Console (I was using RStudio before) and got a slightly more informative message:

15 samples loaded
Warning message:
In if (class(tmp) == "list") { :
  the condition has length > 1 and only the first element will be used

So it seems like this is maybe not a problem as much as a warning. I also tried reinstalling the dev version and got this message:

Skipping install of 'mctoolsr' from a github remote, the SHA1 (a7b17cdf) has not changed since last install. Use `force = TRUE` to force installation

So my install is up to date. Maybe I will try a complete reinstall...
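A complete reinstall might look like the sketch below. It assumes the devtools package is available and that the package is installed from the `leffj/mctoolsr` GitHub repository named at the top of this page. `reinstall_mctoolsr` is a hypothetical helper; it is defined here but deliberately not called.

```r
# Hypothetical helper (not part of mctoolsr): remove any installed copy,
# then force a fresh install from GitHub even if the SHA1 has not changed.
reinstall_mctoolsr <- function() {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    stop("devtools is required for this sketch")
  }
  # Remove the existing copy, if there is one.
  if ("mctoolsr" %in% rownames(installed.packages())) {
    remove.packages("mctoolsr")
  }
  # force = TRUE is the option the "Skipping install" message refers to.
  devtools::install_github("leffj/mctoolsr", force = TRUE)
}
```

Calling `reinstall_mctoolsr()` in a fresh R session, before `library(mctoolsr)` has been run, avoids replacing a package that is currently loaded.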

hhollandmoritz commented 3 years ago

I think it's worth trying a full reinstall eventually. But if you're pressed for time, for now you can check the input$data_loaded object and the input$map_loaded object to make sure that they look the way your taxa table and mapping file should.
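One way to run that check is sketched below, using a small mock object so the code runs without the package. The elements `data_loaded` and `map_loaded` are the ones named above. The layout assumed here (samples as columns of `data_loaded` and as row names of `map_loaded`) and the helper `check_loaded` are assumptions made for illustration.

```r
# Hypothetical helper: summarise a loaded object and confirm that the
# sample IDs in the taxa table match those in the mapping file.
check_loaded <- function(input) {
  data_samples <- colnames(input$data_loaded)
  map_samples  <- rownames(input$map_loaded)
  list(
    n_taxa        = nrow(input$data_loaded),
    n_samples     = length(data_samples),
    samples_match = setequal(data_samples, map_samples)
  )
}

# Mock object standing in for the result of load_taxa_table().
mock_input <- list(
  data_loaded = data.frame(S1 = c(10, 0), S2 = c(3, 7),
                           row.names = c("OTU_1", "OTU_2")),
  map_loaded  = data.frame(Treatment = c("a", "b"),
                           row.names = c("S1", "S2"))
)

result <- check_loaded(mock_input)
str(result)
```

With a real object, `check_loaded(input)` together with `head(input$data_loaded)` and `head(input$map_loaded)` gives a quick view of whether the tables were read as expected.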

From what I could tell, the length warning comes from an if statement whose condition evaluated to more than one value. There are a lot of if statements in load_taxa_table(), so presumably one of the functions it calls is behaving unexpectedly and returning something odd to one of them. Since I don't get this warning, it is probably an issue with one of the dependencies. In my experience, errors like this tend to come from updates to tidyverse functions (which mctoolsr depends on).
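The warning text quoted earlier in the thread points at `if (class(tmp) == "list")`. The sketch below shows one way such a condition can end up with more than one element, assuming R 4.0.0 or later, where a matrix carries the two-element class vector `c("matrix", "array")`. The objects `tmp_list` and `tmp_matrix` are stand-ins; how mctoolsr actually builds `tmp` is not shown in this thread.

```r
# Stand-ins for two shapes `tmp` might take (assumption, not mctoolsr code).
tmp_list   <- list(c("Bacteria", "Proteobacteria"), "Bacteria")
tmp_matrix <- matrix(c("Bacteria", "Proteobacteria",
                       "Bacteria", "Firmicutes"), nrow = 2)

# A plain list has a single class, so the comparison gives one logical value.
cond_list <- class(tmp_list) == "list"

# In R >= 4.0.0 a matrix has class c("matrix", "array"), so the same
# comparison gives two logical values. `if` warns about such a condition
# before R 4.2.0 and stops with an error from R 4.2.0 onward.
cond_matrix <- class(tmp_matrix) == "list"

# Alternatives that always return a single TRUE or FALSE.
safe_list   <- is.list(tmp_list)
safe_matrix <- inherits(tmp_matrix, "list")

print(length(cond_list))
print(length(cond_matrix))
print(safe_list)
print(safe_matrix)
```

If this is what happens inside the package, the warning would depend on both the R version and the shape of the parsed taxonomy, which could explain why the same files behave differently on two machines. The thread does not confirm either point.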

I'll keep looking around for leads in that area, but if your input object looks okay, you should be able to continue, although obviously solving the underlying problem would be better.

jarrodscott commented 3 years ago

Great, thanks! My data looks fine when I load it. The strange thing is that the command works fine (no warning) with the example data from the package. For example, if I run

tmp_input <- load_taxa_table("fruits_veggies_taxa_table_wTax.txt", "fruits_veggies_metadata.txt")

I get 32 samples loaded, no warnings. It also works fine with data from this issue (#17). So I am thinking the problem is due, at least in part, to my data (but of course my data worked for you :). Thanks again for all of your help, Hannah!

jarrodscott commented 3 years ago

I finally figured out what the problem is on my end. For unclassified taxonomic ranks, I was using the next highest rank call with the rank as a prefix. For example, if I had an OTU that was not classified below Class, I had the lineage like so:

Bacteria;Proteobacteria;Gammaproteobacteria;c_Gammaproteobacteria;c_Gammaproteobacteria;c_Gammaproteobacteria

Here I substituted c_Gammaproteobacteria for all unclassified ranks. If I remove those and use this instead:

Bacteria;Proteobacteria;Gammaproteobacteria;;;

it works fine. Still no explanation for why it works on your setup, @hhollandmoritz, but at least I figured out why I was getting that warning. Thanks for your help, I will close for now.
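One possible reason the lineage format matters, offered as a guess rather than a reading of the mctoolsr source: if the taxonomy strings are split with something like `sapply()` over `strsplit()`, base R returns a matrix when every lineage splits into the same number of pieces and a list when the counts differ. The helper `split_lineages` is hypothetical, and the Firmicutes lineage is made up for illustration.

```r
# Hypothetical helper: split each lineage string on ";" and let sapply()
# simplify the result if it can.
split_lineages <- function(lineages) {
  sapply(lineages, function(x) strsplit(x, ";", fixed = TRUE)[[1]])
}

# Every rank filled in: each lineage splits into the same number of pieces.
filled <- c(
  "Bacteria;Proteobacteria;Gammaproteobacteria;c_Gammaproteobacteria;c_Gammaproteobacteria;c_Gammaproteobacteria",
  "Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Bacillus"
)

# Unclassified ranks left empty: the first lineage splits into fewer pieces
# than the second, so sapply() cannot simplify to a matrix.
mixed <- c(
  "Bacteria;Proteobacteria;Gammaproteobacteria;;;",
  "Bacteria;Firmicutes;Bacilli;Bacillales;Bacillaceae;Bacillus"
)

print(is.matrix(split_lineages(filled)))
print(is.list(split_lineages(mixed)))
```

Under that assumption, a table where every lineage has all ranks filled would produce a matrix, and in R 4.0.0 or later `class()` of a matrix has two elements, which is the situation the warning describes.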

hhollandmoritz commented 3 years ago

Good to know! Thanks for reporting back!

-- Hannah Holland-Moritz Doctoral Candidate, Fierer Lab EBIO Department University of Colorado Boulder she/her pronouns hhollandmoritz@gmail.com https://hhollandmoritz.github.io

