CornellLabofOrnithology / ebird-best-practices

Best Practices for Using eBird Data
https://CornellLabOfOrnithology.github.io/ebird-best-practices/

can't convert <double> to <list> when transforming to wide format #3

Closed ormaillet closed 4 years ago

ormaillet commented 4 years ago

I've run into an issue when trying to convert pland to the wide format. It had worked for me before while working with a single BCR (13) but now I have refiltered the dataset to include BCR 12 and 13 - could this be the issue? This is the code I'm running before I get the error message:

# transform to wide format, filling in implicit missing values with 0s
pland <- pland %>% 
  pivot_wider(names_from = lc_name, 
              values_from = pland, 
              values_fill = list(pland = 0))

Then the error message:

Error: Can't convert <double> to <list>.
Run `rlang::last_error()` to see where the error occurred.

Then:

rlang::last_error() returned

Error: Internal error: Trace data is not square.
Run `rlang::last_error()` to see where the error occurred.

rlang::last_error()

> #Trace data is not square
> rlang::last_error()
<error/rlang_error>
Internal error: Trace data is not square.
Backtrace:
  1. (function (x, ...) ...
  2. rlang:::print.rlang_error(x)
 10. rlang:::format.rlang_error(x, simplify = simplify, fields = fields)
 12. rlang:::format.rlang_trace(trace, ..., simplify = simplify)
 13. rlang:::trace_format_branch(x, max_frames, dir, srcrefs)
 14. rlang:::branch_uncollapse_pipe(trace)
Run `rlang::last_trace()` to see the full context.

Everything up until then had worked and given me the same-looking long form of "pland", i.e.:

1 2012 L294957  0.31034483 pland_00_water
2 2012 L1516757 0.32258065 pland_00_water
3 2012 L1498674 0.30000000 pland_00_water
......

Thanks in advance!

Olivia

mstrimas commented 4 years ago

I'm not sure what's going on here; there's no reason it shouldn't work for multiple BCRs. There must be something different about the data frame that's causing pivot_wider() to fail. I'd suggest creating a small reproducible example, e.g. look at a single year and location. My guess is that in doing so you'll solve your own problem, and if not, post the reproducible example here and I'll see what I can do.
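A minimal sketch of how such a reproducible example could be built, assuming a long-format data frame shaped like the rows quoted in the original post. The three rows are copied from that post, year is assumed to be stored as character, and `pland_small` is a hypothetical name used only for illustration.

```r
library(dplyr)

# Stand-in for the full long-format data (rows copied from the original post)
pland <- tibble(
  year = c("2012", "2012", "2012"),
  locality_id = c("L294957", "L1516757", "L1498674"),
  pland = c(0.31034483, 0.32258065, 0.30000000),
  lc_name = "pland_00_water"
)

# Narrow the data down to a single year and location
pland_small <- pland %>%
  filter(year == "2012", locality_id == "L294957")

# Print a copy-pasteable definition of the subset for posting in an issue
dput(pland_small)
```

The output of `dput()` can be pasted directly into a comment so that others can recreate the same object.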

ormaillet commented 4 years ago

Thanks for the advice, creating a small reproducible example did reveal what I believe is the issue. Using it instead returned the following error message:

Error: Must subset columns with a valid subscript vector.
x Subscript has the wrong type `tbl_df<
  year       : character
  locality_id: character
  pland      : double
  lc_name    : character
>`.
i It must be numeric or character.

I'm still not sure what to do here!

Olivia

mstrimas commented 4 years ago

Can you paste the reproducible example here, so I can run it?

ormaillet commented 4 years ago

Session info:

R version 4.0.2 (2020-06-22)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)
Matrix products: default
locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252    LC_MONETARY=English_Canada.1252
[4] LC_NUMERIC=C                    LC_TIME=English_Canada.1252    
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
other attached packages:
 [1] viridis_0.5.1       viridisLite_0.3.0   exactextractr_0.4.0 sf_0.9-4            rnaturalearth_0.1.0
 [6] forcats_0.5.0       stringr_1.4.0       purrr_0.3.4         readr_1.3.1         tidyr_1.1.0        
[11] tibble_3.0.1        ggplot2_3.3.2       tidyverse_1.3.0     gridExtra_2.3       lubridate_1.7.9    
[16] auk_0.4.1           remotes_2.1.1       dplyr_1.0.0         MODIS_1.2.2         raster_3.3-7       
[21] sp_1.4-2            mapdata_2.3.0       maps_3.3.0         

This is the part that is not working:

# transform to wide format, filling in implicit missing values with 0s
sre_pland <- sre_pland %>% 
  pivot_wider(names_from = lc_name, 
              values_from = sre_pland, 
              values_fill = list(sre_pland = 0))

Reproducible example

structure(list(year = c("2012", "2012", "2012", "2012", "2012"
), locality_id = c("L1532473", "L1532473", "L1532473", "L1532473", 
"L1532473"), pland = c(0.233333333333333, 0.1, 0.266666666666667, 
0.0333333333333333, 0.366666666666667), lc_name = c("pland_00_water", 
"pland_08_woody_savanna", "pland_09_savanna", "pland_10_grassland", 
"pland_13_urban")), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

I hope I've done this correctly - if not please let me know and I will fix it! I'm new to this.

mstrimas commented 4 years ago

That does the trick, thanks! The issue appears to be that you're using the wrong column names for pland. This works for me:

sre_pland %>% 
  pivot_wider(names_from = lc_name, 
              values_from = pland, 
              values_fill = list(pland = 0))

mstrimas commented 4 years ago

Hmm, looking back at the original issue, I don't think this is what's causing it. This may be something new introduced when you tried to create the reproducible example. Let me know.

ormaillet commented 4 years ago

Yes, you are right...this code worked for the reproducible example just fine. It transformed it to wide format without an issue.

But running it again on the actual dataset I had taken the example from still gives me the same error message:

pland %>% 
  pivot_wider(names_from = lc_name, 
              values_from = pland, 
              values_fill = list(pland = 0))

Error: Can't convert <double> to <list>.

To create the reproducible example, I had filtered the data set (pland) to just one specific locality_ID (L1532473) and one year (2012)...so I'm confused as to why it worked for the example, but not for the whole pland dataset. Also, I just created another example with two different locality_IDs and two different years, and it transformed it to wide format again without an issue. So could the problem possibly be size? My pland is 619024 obs. of 4 variables. Or is there some other problem that would cause them not to line up correctly? Just speculating.

Thanks!

mstrimas commented 4 years ago

My guess is there's some duplication in your data that's causing this. For example, I think this recreates your error:

rbind(sre_pland, sre_pland) %>% 
  tidyr::pivot_wider(names_from = lc_name, 
              values_from = pland, 
              values_fill = list(pland = 0))

So I'd look for duplicates.
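One way to look for such duplicates, sketched under the assumption that each combination of year, locality_id and lc_name should appear only once. The data are the reproducible example from earlier in the thread, stacked on itself to simulate the problem; `pland_dup` and `dupes` are hypothetical names.

```r
library(dplyr)

# One location-year, taken from the reproducible example in this thread
sre_pland <- tibble(
  year = "2012",
  locality_id = "L1532473",
  pland = c(0.233333333333333, 0.1, 0.266666666666667,
            0.0333333333333333, 0.366666666666667),
  lc_name = c("pland_00_water", "pland_08_woody_savanna", "pland_09_savanna",
              "pland_10_grassland", "pland_13_urban")
)

# Simulate the suspected problem by stacking the data on itself
pland_dup <- bind_rows(sre_pland, sre_pland)

# Count rows per year/location/land cover class; any combination with
# n > 1 cannot be pivoted to a single value per cell
dupes <- pland_dup %>%
  count(year, locality_id, lc_name) %>%
  filter(n > 1)

print(dupes)
```

If `dupes` has zero rows, every cell of the wide table is uniquely identified and duplication is probably not the cause.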

ormaillet commented 4 years ago

You are right! There was some duplication. I'm still unsure of how it got there in the first place as it wasn't obvious, but I was able to remove it and the transform function worked.

lime-n commented 4 years ago

I received the same error. To get rid of it, I tried:

pland <- pland %>% distinct() %>%
  pivot_wider(names_from = lc_name, 
              values_from = pland, 
              values_fill = list(pland = 0))

However, when you get to the occupancy chapter, and onto occu-wide formatting, this will eventually return the error:

Error in format_unmarked_occu(occ, site_id = "site", response = "species_observed",  : 
  Site-level covariates must be constant across sites

If you can somehow get the above code to work without any error, then the occupancy code will work.
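Note that distinct() only drops rows that are identical in every column, so rows that share the same year, location and land cover class but carry different pland values would survive it. Whether that explains the occupancy error above is not established in this thread. The sketch below uses made-up values, and the choice of `mean` to collapse duplicates is an arbitrary assumption for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: the water class appears twice for the same
# year and location, with two different pland values
pland <- tibble(
  year = "2012",
  locality_id = "L1532473",
  pland = c(0.2, 0.4, 0.1),
  lc_name = c("pland_00_water", "pland_00_water", "pland_13_urban")
)

# distinct() keeps both water rows because their pland values differ
n_after_distinct <- nrow(distinct(pland))

# Collapse key-level duplicates explicitly while pivoting
pland_wide <- pland %>%
  pivot_wider(names_from = lc_name,
              values_from = pland,
              values_fn = list(pland = mean),
              values_fill = list(pland = 0))

print(pland_wide)
```

Inspecting the duplicated rows before choosing how to collapse them is usually safer than averaging blindly, since they may point to a problem earlier in the workflow.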

AltheaWang commented 3 months ago

I have encountered the same problem and I have no idea how to solve it. I have checked the data again and again, and I am so upset. Here is a summary of the data and my code. In this part, I want to see the association of three groups, so the same IDs appear more than once in the ID column. Please help me, thanks.

Groupline_uni <- read_xlsx("/Users/data_analysis.xlsx")
print(head(Groupline_uni))
print(str(Groupline_uni))
Groupline_uni$group3 <- factor(Groupline_uni$group3)
print(levels(Groupline_uni$group3))

upset_data <- Groupline_uni %>% select(ID, group3) %>% mutate(value = 1) %>%
pivot_wider(names_from = group3, values_from = value, values_fill = list(value = 0))

When I run only the first part:

upset_data <- Groupline_uni %>% select(ID, group3) %>% mutate(value = 1)

I get:

tibble [254 × 3] (S3: tbl_df/tbl/data.frame)
 $ ID    : chr [1:254] "P803217" "P826202" "P826202" "P647286" ...
 $ group3: Factor w/ 3 levels "post-first","post-n",..: 3 3 1 2 3 1 1 1 2 1 ...
 $ value : num [1:254] 1 1 1 1 1 1 1 1 1 1 ...

But when I run the whole pipeline, I get an error like:

Error in `pivot_wider()`:
! Can't convert `fill` <double> to <list>.
Run `rlang::last_trace()` to see where the error occurred.

Please help me thanks.
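One possible cause, consistent with the diagnosis earlier in this thread but not confirmed for this data set, is the same ID and group3 pair appearing more than once. A minimal sketch of a membership table built after dropping repeated pairs is shown below; the IDs and group labels are made up, and `groupline` is a hypothetical stand-in for Groupline_uni.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for Groupline_uni; the last two rows repeat
# the same ID/group3 pair
groupline <- tibble(
  ID = c("P001", "P002", "P002", "P003", "P003"),
  group3 = factor(c("g3", "g3", "g1", "g2", "g2"))
)

upset_data <- groupline %>%
  distinct(ID, group3) %>%   # keep each ID/group3 pair only once
  mutate(value = 1) %>%
  pivot_wider(names_from = group3,
              values_from = value,
              values_fill = list(value = 0))

print(upset_data)
```

An ID that legitimately belongs to several groups still ends up on a single row, with a 1 in each group column it belongs to.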

mstrimas commented 3 months ago

Hi @AltheaWang, I'm sorry to hear you're having trouble. Can you share the data so I can try to reproduce your error?

AltheaWang commented 3 months ago

Hey Matt,

Thanks for your reply. I apologize for my late response; I haven't checked this inbox for a while. I have already resolved the previous issue by reviewing my data frame. However, I encountered another problem: "Error in start_col:end_col : argument of length 0". I'm uncertain why I am facing this error, as I cannot identify any blanks between the start column and end column in my data.

I would greatly appreciate it if you could take some time to look over this issue and provide any feedback or insights you may have.

Thank you for your assistance! Have a great summer time!

Meng-Min

rm(list = ls())
library(readxl)
library(UpSetR)
library(tidyverse)

data_c <- read_xlsx("/Users/Desktop/Upset_plot.xlsx")
View(data_c)

clinical_data_filtered <- data_c %>% filter(p_type %in% c(1, 3))

clinical_data_uni <- clinical_data_filtered
print(head(clinical_data_uni <- clinical_data_filtered))
print(str(clinical_data_uni <- clinical_data_filtered))

clinical_data_uni$group3 <- factor(clinical_data_uni$group3)
print(levels(clinical_data_uni$group3))

upset_data <- clinical_data_uni %>%
  select(Tumor_Sample_Barcode, group3) %>%
  mutate(value = 1) %>%
  pivot_wider(names_from = group3, values_from = value, values_fill = list(value = 0))

upset_data_new <- upset_data %>%
  column_to_rownames(var = "Tumor_Sample_Barcode") %>%
  mutate(across(everything(), ~ . > 0))

print(head(upset_data_new))

upset(upset_data_new, sets = colnames(upset_data_new), main.bar.color = "#56B4E9", sets.bar.color = "#009E73", keep.order = TRUE, order.by = "freq")

mstrimas commented 3 months ago

Hi @AltheaWang, I'd be happy to look into your issue, but I will need to see the data and code that produce the problem so I can recreate the error.

AltheaWang commented 3 months ago

Yes, so I attached the files in the email. Can you see them?

AltheaWang commented 3 months ago

Of course. Plz wait a minute:)))

On 08/18/2024 08:05, Matt wrote:

they didn't come through, can you email to @.***?

mstrimas commented 3 months ago

I received your code and data, but can't provide any guidance. It doesn't appear you are working with eBird data at all, and your error is with the UpSetR package, which I've never used. I think you've posted this in the wrong location; I would suggest posting the error at https://github.com/hms-dbmi/UpSetR instead.

AltheaWang commented 3 months ago

Thanks for your kind help. Yes, I didn't use eBird data. Thanks for your advice; I will look at the website you provided. Thanks so much!

Have a nice day.

Best, Althea
