Closed jinshijian closed 4 years ago
I can't easily see the diff here; there are too many changes. Can you provide a summary or mapping of the changes?
This reduces the number of unique Manipulation strings from 689 to 293, a big improvement.
There are 11 Manipulation strings that get mapped to more than one new string; we should look carefully at these.
> filter(mapping, different_new > 1)
# A tibble: 11 x 3
   Manipulation           different_new new_strings
   <fct>                          <int> <chr>
 1 Extra litter                       2 Extra litter, Litter manipulation
 2 Fertilized, irrigation             2 Fertilized, irrigation, Fertilized, irrigated
 3 Harvest                            2 Litter manipulation, Harvest
 4 Herbivore exclusion                2 Herbivore exclusion, None
 5 Inter-canopy                       2 None, Burned
 6 Mineral                            2 Contaminant, Fertilized
 7 None                               2 None, Weed control
 8 sewage sludges                     2 None, Fertilized
 9 Stem wood harvest                  2 Litter manipulation, Harvest
10 Thinned, double litter             2 Litter manipulation, Thinned, litter manipulation
11 Under-canopy                       2 None, Burned
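For anyone wanting to reproduce this kind of check: the thread does not show how `mapping` was built, so the following is only a minimal sketch of one way to get a table of that shape. It assumes a joined data frame (here called `joined`, a hypothetical name) holding the old `Manipulation` and the new `new_manipulation` columns; the rows are toy values, not the real database.

```r
library(dplyr)

# Toy stand-in for the joined old/new data (illustrative values only);
# column names follow the console output shown in this thread.
joined <- tibble(
  Manipulation     = c("Extra litter", "Extra litter", "Harvest", "Burned"),
  new_manipulation = c("Extra litter", "Litter manipulation", "Harvest", "Burned")
)

# For each old string, count the distinct new strings and list them.
mapping <- joined %>%
  group_by(Manipulation) %>%
  summarise(
    different_new = n_distinct(new_manipulation),
    new_strings   = paste(unique(new_manipulation), collapse = ", ")
  )

# Old strings that were split across more than one new string.
multi <- filter(mapping, different_new > 1)
print(multi)
```

Because the new strings are pasted together with ", ", a new string that itself contains a comma (such as "Fertilized, irrigated") is hard to tell apart in `new_strings`; a different separator such as " | " might make the output easier to read.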
So for example
> x %>% left_join(x_branch) %>% filter(Manipulation=="Extra litter")
Joining, by = c("Record_number", "Entry_date", "Study_number")
   Record_number Entry_date Study_number Manipulation Manipulation_level new_manipulation    new_man_level
1           5962 2017-02-06         9563 Extra litter All litter         Extra litter        All litter
2           5963 2017-02-06         9563 Extra litter All litter         Extra litter        All litter
3           5964 2017-02-06         9563 Extra litter All litter         Extra litter        All litter
4           5965 2017-02-06         9563 Extra litter S. superba litter  Extra litter        S. superba litter
5           5966 2017-02-06         9563 Extra litter S. superba litter  Extra litter        S. superba litter
6           5967 2017-02-06         9563 Extra litter S. superba litter  Extra litter        S. superba litter
7           5968 2017-02-06         9563 Extra litter O. pinnata litter  Extra litter        O. pinnata litter
8           5969 2017-02-06         9563 Extra litter O. pinnata litter  Extra litter        O. pinnata litter
9           5970 2017-02-06         9563 Extra litter O. pinnata litter  Extra litter        O. pinnata litter
10          5973 2017-02-08         8917 Extra litter                    Extra litter
11          5976 2017-02-08         8917 Extra litter                    Extra litter
12          5979 2017-02-08         8917 Extra litter                    Extra litter
13          8025 2020-01-14         8376 Extra litter                    Litter manipulation Extra litter
14          8026 2020-01-14         8376 Extra litter                    Litter manipulation Extra litter
Most "Extra litter" records get mapped to "Extra litter", but a couple are mapped to "Litter manipulation".
@jinshijian Here's a quick combined file that may make it easy to filter and see what's being mapped to what.
Cool! Great check. I went back, checked these, and made some changes. Notes on the remaining cases:
No. 4: Herbivore exclusion: one record is the control (its level is none), so this should map to both None and Herbivore exclusion.
No. 5: Inter-canopy is not a manipulation; it has two levels (None and Burn), so the manipulation should be changed to None and Burned.
No. 7: for record number 9851, the manipulation level is weed-free, so the manipulation should be changed to Weed control rather than None.
No. 8: for record number 9223 (sewage sludges), the manipulation level is none, so it should be changed to None (it is the control).
No. 11: Under-canopy is not a manipulation; its levels are None and Burn, so it should be changed to None and Burned.
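The reasoning above (a level of none marks a control; Inter-canopy and Under-canopy are positions, not manipulations) could be expressed as a rule to flag records for review. This is a rough sketch only: the checks in this thread were done by hand, the `suggested` column and the record numbers below are hypothetical, and the rows are toy values.

```r
library(dplyr)

# Toy records (illustrative only); column names follow the thread's output.
records <- tibble(
  Record_number      = c(1L, 2L, 3L),  # hypothetical IDs
  Manipulation       = c("Herbivore exclusion", "Herbivore exclusion", "Inter-canopy"),
  Manipulation_level = c("None", "Fenced", "Burn")
)

flagged <- records %>%
  mutate(suggested = case_when(
    # A level of "none" means the record is the control.
    tolower(Manipulation_level) == "none" ~ "None",
    # Canopy position is not a manipulation; the level carries the treatment.
    Manipulation %in% c("Inter-canopy", "Under-canopy") &
      tolower(Manipulation_level) == "burn" ~ "Burned",
    # Otherwise keep the existing string.
    TRUE ~ Manipulation
  ))

# Rows where the rule disagrees with the current string are worth a manual look.
print(filter(flagged, suggested != Manipulation))
```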
I have updated the pull request, please check. Thanks!
Looks a lot better, thanks for your work on this! Down to 290 Manipulation categories. As you note above:
> filter(mapping, different_new > 1)
# A tibble: 6 x 3
  Manipulation        different_new new_strings
  <fct>                       <int> <chr>
1 Herbivore exclusion             2 Herbivore exclusion, None
2 Inter-canopy                    2 None, Burned
3 Mineral                         2 Harvest, Fertilized
4 None                            2 None, Weed control
5 sewage sludges                  2 None, Fertilized
6 Under-canopy                    2 None, Burned
Is there anything else we want to do before merging this?
Hello Ben, I double-checked the Mineral one: study 6534 tests the effect of harvest, and study 10795 tests fertilization. So I think it is correct that the old "Mineral" maps to two different Manipulation values, and nothing more needs to be done. Thanks
Hello @bpbond, I did a bunch of work trying to standardize the manipulation. Basically, I used the same terminology for the same kind of treatment (e.g., burned for fire, burning, burn, burnt). There is still some room for improvement; I will take a look at it later. Thanks
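As an illustration of the kind of standardization described (one term for fire, burning, burn, burnt), here is a minimal sketch using a lookup vector. The `synonyms` table and the `standardize()` helper are hypothetical names, not taken from the pull request, and the real mapping covers many more terms.

```r
library(dplyr)

# Hypothetical synonym table: lower-cased variant -> standard term.
synonyms <- c(
  fire    = "Burned",
  burning = "Burned",
  burn    = "Burned",
  burnt   = "Burned",
  burned  = "Burned"
)

# Replace known variants with the standard term; leave everything else as-is.
standardize <- function(x) {
  key <- tolower(trimws(x))
  out <- unname(synonyms[key])
  ifelse(is.na(out), x, out)
}

# Toy data (illustrative values only).
x <- tibble(Manipulation = c("Fire", "burnt", "Burning", "Thinned"))
x <- mutate(x, new_manipulation = standardize(Manipulation))
print(x)

# Number of unique strings before and after standardization.
c(before = n_distinct(x$Manipulation), after = n_distinct(x$new_manipulation))
```

Keeping the original column alongside the new one, as in the thread's `new_manipulation`, makes it easy to audit the result with the same `mapping` check used above.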