traitecoevo / taxonlookup

A versioned and dynamically updating taxonomic lookup table for land plants
http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12517/abstract

duplicate 'row.names' are not allowed. when by_species = TRUE #21

Open rossmounce opened 8 years ago

rossmounce commented 8 years ago

I have a large list of genera to look up, and I (knowingly) have lots of duplicates. I don't want it all uniq'd down to unique genera only.

If the list starts with "Aa" first and "Aa" second, I want two separate lines output e.g.

  genus      family       order       group
1    Aa Orchidaceae Asparagales Angiosperms
2    Aa Orchidaceae Asparagales Angiosperms

But instead it just throws an error and doesn't output any table. I can only get an output table with by_species=FALSE and it only has ~12,000 in it (not what I want).

str(A)
 chr [1:337444] "Aa" "Aa" "Aaronsohnia" "Narthecium" "Abarema" ...

lookup_table(A,missing_action="NA",by_species=TRUE)
Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘Aa’, ‘Abarema’, ‘Abelia’, ‘Abeliophyllum’, ‘Abelmoschus’, ‘Abies’, ‘Abrodictyum’, ‘Abroma’, ‘Abronia’, ‘Abrophyllum’, ‘Abrotanella’, ‘Abrus’, ‘Abuta’, ‘Abutilon’, ‘Acacia’, ‘Acaciella’, ‘Acaena’, ‘Acalypha’, ‘Acampe’, ‘Acamptopappus’, ‘Acanthephippium’, ‘Acanthocalycium’, ‘Acanthocalyx’, ‘Acanthocarpus’, ‘Acanthocereus’, ‘Acantholimon’, ‘Acantholippia’, ‘Acantholobivia’, ‘Acanthomintha’, ‘Acanthopale’, ‘Acanthopanax’, ‘Acanthophoenix’, ‘Acanthophyllum’, ‘Acanthoprasium’, ‘Acanthopsis’, ‘Acanthorhipsalis’, ‘Acanthorrhiza’, ‘Acanthoscyphus’, ‘Acanthosicyos’, ‘Acanthospermum’, ‘Acanthostachys’, ‘Acanthosyris’, ‘Acanthus’, ‘Acaulimalva’, ‘Acca’, ‘Acer’, ‘Aceratium’, ‘Achatocarpus’, ‘Achetaria’, ‘Achillea’, ‘Achimenes’, ‘Achlys’, ‘Achnather [... truncated] 

Is it possible to coerce it to the style of output I desire, outputting 'dumbly' for each and every input name, even if there are duplicates?

richfitz commented 8 years ago

Hi Ross. I appreciate the bug reports, but could you do us a favour and provide a minimal reproducible example? (i.e., 4-5 lines that we can run beginning to end to recreate your problem -- see here for more information).

I'm sure we can reverse engineer the issues you've found but it just makes it a lot easier, and therefore keeps it a little higher in the list of things to do.

rossmounce commented 8 years ago

Sure.

Small example of the feature request is below:

plant_lookup_version_current()
[1] "1.1.1"
lookup_table(c("Aa","Aa","Aaronsohnia","Narthecium","Abarema","Abarema","Abarema"),missing_action="NA")
        genus        family        order       group
1          Aa   Orchidaceae  Asparagales Angiosperms
2 Aaronsohnia    Asteraceae    Asterales Angiosperms
3  Narthecium Nartheciaceae Dioscoreales Angiosperms
4     Abarema      Fabaceae      Fabales Angiosperms

I would expect/want this output instead:

        genus        family        order       group
1          Aa   Orchidaceae  Asparagales Angiosperms
2          Aa   Orchidaceae  Asparagales Angiosperms
3 Aaronsohnia    Asteraceae    Asterales Angiosperms
4  Narthecium Nartheciaceae Dioscoreales Angiosperms
5     Abarema      Fabaceae      Fabales Angiosperms
6     Abarema      Fabaceae      Fabales Angiosperms
7     Abarema      Fabaceae      Fabales Angiosperms

As for the bug, I've uploaded a smaller set of 1000 names to a github gist to enable reproduction:

testnames <- readLines("https://gist.githubusercontent.com/rossmounce/fcac3b61324f1dcf721e/raw/5665d9de08907f6dd54898f2fbcb52a62d0279d0/A%2520list%2520of%2520plant%2520genera", warn=FALSE)
zzz <- lookup_table(testnames,missing_action="NA",by_species=TRUE)
Error in `row.names<-.data.frame`(`*tmp*`, value = value) : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘Aa’, ‘Abarema’, ‘Abelia’, ‘Abeliophyllum’, ‘Abelmoschus’, ‘Abies’, ‘Abrodictyum’, ‘Abroma’, ‘Abronia’, ‘Abrophyllum’, ‘Abrotanella’, ‘Abrus’, ‘Abuta’, ‘Abutilon’, ‘Acacia’, ‘Corynabutilon’, ‘Diabelia’
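
For the feature request above (one output row per input name, duplicates included), one possible workaround is to look up each genus once and then expand the result with base R's match(). The sketch below is only illustrative: `lut` is a hand-built data frame copied from the output shown earlier, standing in for whatever lookup_table() returns, and the names `lut`, `genera`, and `expanded` are hypothetical. It does not address the row.names error itself.

```r
# Stand-in for the de-duplicated table printed earlier in this thread.
# Assumption: in real use this would be the data frame returned by lookup_table().
lut <- data.frame(
  genus  = c("Aa", "Aaronsohnia", "Narthecium", "Abarema"),
  family = c("Orchidaceae", "Asteraceae", "Nartheciaceae", "Fabaceae"),
  order  = c("Asparagales", "Asterales", "Dioscoreales", "Fabales"),
  group  = rep("Angiosperms", 4),
  stringsAsFactors = FALSE
)

# Input names, duplicates included, in the order they should appear.
genera <- c("Aa", "Aa", "Aaronsohnia", "Narthecium",
            "Abarema", "Abarema", "Abarema")

# match() returns, for each input name, the row of its first match in
# lut$genus (NA when the name is absent), so indexing repeats rows as needed.
expanded <- lut[match(genera, lut$genus), ]

# Keep the input name even for rows that had no match.
expanded$genus <- genera

# Replace the duplicated-index row names ("1", "1.1", ...) with 1..n.
rownames(expanded) <- NULL

print(expanded)
```

Because the expansion happens after the lookup, the lookup itself only ever sees unique names, and any genus missing from the table comes back as a row of NA values with the input name preserved.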