lizzieinvancouver / ospree

Budbreak review paper database

clean_duplicates was deleted #450

Open lizzieinvancouver opened 1 year ago

lizzieinvancouver commented 1 year ago

I assume unintentionally ...

lizzieinvancouver commented 1 year ago

I was fixing a small issue with spp names (issue #449) and noticed an entire cleaning file is gone. I used:

git log --full-history -- analyses/cleaning/clean_duplicates.R

to track back to here, and I think this was accidentally deleted by @DeirdreLoughnan here, which is commit 85ae4d88863e71335d023e0ba14235dcfd25a5fd ...

lizzieinvancouver commented 1 year ago

Sorry @FaithJones I don't think you should be on this issue, as you were just deleting this file via a merge (see commit 605039e2a1da54647cfe6c1d08bdd67fa89b8ffb).

lizzieinvancouver commented 1 year ago

In commit # d2c70339fd8ebabb8bf076d4166c7c6ffc380de6 ... I added it back in by just copying and pasting from when it was deleted (see here for exactly what I copied).

The previous commit before the deletion appears to be from 2017.

lizzieinvancouver commented 1 year ago

Re-run ... deletes 234 rows, which (according to our August 2019 notes) is 5 fewer (was 239) than before we deleted cleaning/clean_duplicates.R in Jan 2021, but it is the best we can do. See commit # 45ded0f1211b76522ac208b9be0aa6c60f86f287 for updated data.

lizzieinvancouver commented 1 year ago

@DeirdreLoughnan Could you review this issue and double-check my thinking that this was maybe a mistaken deletion you did when working on the clean_duplicates for traitors (check the commit links I have)? I know it was a while ago, but see what you think ....

Could you also do a quick comparison of ospree_clean.csv as we have it now versus the August 2020 version (you can keep the old version by grabbing/renaming what you have now before you git pull my changes) -- see here for the history of commits on this file. Thanks!

lizzieinvancouver commented 1 year ago

I am not sure how best to handle this ... we likely should rerun EVERYTHING with the updated data to be safe.... that would mean redoing everything in this file. To discuss once we hear what @DeirdreLoughnan finds.

DeirdreLoughnan commented 1 year ago

@lizzieinvancouver I am so sorry, this is a huge mistake on my part and not something I remember doing on purpose!

I will also see if I can get the original version from off of my old hard drive and compare it to the current files.

DeirdreLoughnan commented 1 year ago

@lizzieinvancouver again I am really sorry for all the trouble this mistake has caused!

I was able to get a copy of analyses/cleaning/clean_duplicates.R off of my hard drive. The oldest version I have is from January 17, 2019, long before I ever modified it.

When I source this code in the cleaning/cleanmerge_all.R file I also get 234 rows of data being removed.

I also compared different versions of the ospree_clean.csv file. The version of the file that I had previously, which for me was Oct 9, 2020 (I must not have pulled frequently back then...the version I have before this is May 13 2020), seems to be the exact same as the version that you just pushed.

I even double checked this and used the FileMerge program on my old Mac and the only difference between my old 2020 version of ospree_clean.csv and the new one is that the species epithet from Malyshev18 was changed from "pseudolatanus" to "pseudoplatanus".

Is it possible that the note from August 22 2020 is wrong? Let me know how else I can help fix the mess my mistake caused!

lizzieinvancouver commented 1 year ago

I am so sorry, this is a huge mistake on my part and not something I remember doing on purpose!

@DeirdreLoughnan This is not a huge mistake at all! I can easily see how it may have happened and no matter how it happened, it could have happened to anyone. I am just skimming your other reply now as I have to run to a meeting, but it could be that the Aug 2020 note was wrong ... thanks for all your effort on this and I will look more later.

lizzieinvancouver commented 1 year ago

@DeirdreLoughnan

When I source this code in the cleaning/cleanmerge_all.R file I also get 234 rows of data being removed.

Is it possible that the note from August 22 2020 is wrong?

Definitely possible! I started those notes to keep track somewhat, but we were not committed to them nor always careful about updating them, and we never put in print commands or such, so lots of room for human error.
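As a rough sketch of what such a print command could look like: the helper below drops duplicate rows and reports how many were removed, so the count ends up in the console output rather than in hand-kept notes. The function name `remove_duplicates_logged`, its `key_cols` argument, and the toy data are all hypothetical; the actual criteria in clean_duplicates.R are not shown in this thread and may well differ.

```r
# Hypothetical helper (not from the OSPREE repo): drop duplicate rows and
# report how many were removed. `key_cols` is an assumed argument naming
# the columns used to decide whether two rows are duplicates.
remove_duplicates_logged <- function(d, key_cols = names(d)) {
  n_before <- nrow(d)
  keep <- !duplicated(d[, key_cols, drop = FALSE])
  d_clean <- d[keep, , drop = FALSE]
  n_removed <- n_before - nrow(d_clean)
  message("clean_duplicates: removed ", n_removed, " of ", n_before, " rows")
  d_clean
}

# Toy data for illustration only (not OSPREE data)
toy <- data.frame(datasetID = c("a", "a", "b"), response = c(1, 1, 2))
toy_clean <- remove_duplicates_logged(toy)
```

Using `message()` keeps the count out of the returned data while still showing it whenever the cleaning script is sourced.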

I also compared different versions of the ospree_clean.csv file. The version of the file that I had previously, which for me was Oct 9, 2020 (I must not have pulled frequently back then...the version I have before this is May 13 2020), seems to be the exact same as the version that you just pushed.

I even double checked this and used the FileMerge program on my old Mac and the only difference between my old 2020 version of ospree_clean.csv and the new one is that the species epithet from Malyshev18 was changed from "pseudolatanus" to "pseudoplatanus".

Thank you! I just pulled an older ospree_clean versus yesterday's and compared too. I found lots of special character differences (oy, the town of As is a pain) and then the pseudoplatanus (which started all this) but nothing else. So I think we're good! We'll just have to try to update all the data someday. I will leave this open until then.
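For anyone repeating this kind of check, a cell-by-cell comparison can also be sketched in R. This is only an illustration: `compare_versions` is a hypothetical helper, it assumes both versions have the same columns and the same row order, and the toy data frames below stand in for the two versions of ospree_clean.csv (which would normally be loaded with `read.csv()`).

```r
# Hypothetical helper (not from the OSPREE repo): list the cells that differ
# between two versions of a table. Assumes identical columns and row order.
compare_versions <- function(old, new) {
  stopifnot(identical(names(old), names(new)), nrow(old) == nrow(new))
  to_chr <- function(d) {
    m <- as.matrix(data.frame(lapply(d, as.character),
                              stringsAsFactors = FALSE, check.names = FALSE))
    m[is.na(m)] <- "<NA>"  # treat missing values as a comparable token
    m
  }
  old_m <- to_chr(old)
  new_m <- to_chr(new)
  idx <- which(old_m != new_m, arr.ind = TRUE)
  data.frame(row = idx[, "row"],
             column = names(old)[idx[, "col"]],
             old = old_m[idx],
             new = new_m[idx],
             row.names = NULL,
             stringsAsFactors = FALSE)
}

# Toy stand-ins for the two file versions (illustration only)
old <- data.frame(genus = c("Acer", "Betula"),
                  species = c("pseudolatanus", "pendula"))
new <- data.frame(genus = c("Acer", "Betula"),
                  species = c("pseudoplatanus", "pendula"))
diffs <- compare_versions(old, new)
print(diffs)
```

Comparing everything as character avoids spurious differences from column types, but special-character differences (encoding) would still show up as changed cells.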

lizzieinvancouver commented 1 year ago

@cchambe12 updated the chilling files -- yay! But I am getting an error in bb_cleanmergeall.R ...

here ...

> source("bb_analysis/cleaning/clean_ambientforcingfromdailyclimate.R") # 4 Dec 11959: (still 11404)
Error in if (length(unique(clim.i$year == year.end.i & clim.i$doy == doy.end.i)) ==  : 
  the condition has length > 1

In clean_ambientforcingfromdailyclimate.R I am fairly sure it's happening here:

else if (length(unique(clim.i$year==year.end.i & clim.i$doy==doy.end.i))==1 & unique(clim.i$year==year.end.i & clim.i$doy==doy.end.i)==FALSE){

at i=49 ... but I cannot track down why exactly. @cchambe12 if you have any ideas, let me know.
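One possible reading of the error, sketched with toy data: `clim.i$year==year.end.i & clim.i$doy==doy.end.i` is an elementwise comparison, so when some rows match the end date and others do not, `unique(...)` has two values and the `&` test passed to `if` has length 2. On R 4.2.0 and later that is an error, while earlier versions only gave a warning, so the behaviour can differ between machines. The data frame below is a made-up stand-in for `clim.i`, and the rewrite with `any()` is an assumption about what the line is meant to test, not a confirmed fix.

```r
# Toy stand-in for clim.i (not real OSPREE climate data)
clim.i <- data.frame(year = c(2000, 2000, 2001), doy = c(364, 365, 1))
year.end.i <- 2000
doy.end.i <- 365

# Elementwise match against the end date: one logical per row
is.end <- clim.i$year == year.end.i & clim.i$doy == doy.end.i

# The test from the line above: unique(is.end) has two values here,
# so the result of `&` is a vector of length 2, not a single TRUE/FALSE
cond.original <- length(unique(is.end)) == 1 & unique(is.end) == FALSE
length(cond.original)

# Possible rewrite (assumed intent: "no row matches the end date"),
# which always yields a single logical when there are no NAs
no.end.row <- !any(is.end)
```

Note that `!any(is.end)` is not identical to the original test in every edge case: it returns TRUE for an empty `clim.i` and NA if the comparison contains NAs and no TRUE values, so those cases would need to be checked at i=49.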

cchambe12 commented 1 year ago

@lizzieinvancouver For some reason I didn't get an error... which feels weird to me! I pushed the new cleaned ospree data to the repo and am doing some checks now but I think it looks okay

Yes! Seems okay!