lizzieinvancouver / egret

some tweaks to oegres.xlsx and related instructions #4

Closed lizzieinvancouver closed 1 month ago

lizzieinvancouver commented 2 years ago

These are all relatively small tweaks, but would be good to make in the main oegres.xlsx file, then ask people to update their files similarly (update the meta tab, add a notes column, switch to carefully noting rejected papers).

lizzieinvancouver commented 2 years ago

A couple more!

DeirdreLoughnan commented 2 years ago

@lizzieinvancouver I have amended the excel file to address your comments. Let me know if you see any other changes that should be made.

We also decided on the following:

  1. A new column called “germ.time.zero” to account for studies that do not measure the day of germination relative to when they started the experiment or finished the chilling. For example, they may have recorded germination since the day the first germinant emerged. This column is for a categorical variable, with several of the most common responses listed in the meta_general tab. If we find that a lot of studies are measuring germination relative to another time point, we might add a second column. So let us know if you record this often.
  2. For now, we will not be doing papers in languages we are not fluent in (but if you are able to read a paper not in English, feel free to do so). Again, add to the language tab that the paper is not in English and that you are rejecting it for this reason.
  3. We do not need to scrape data for red light for example or difficult to scrape variables on the log scale.
  4. We are possibly interested in data on seed mass, but for now we have just added a column, “seed.mass.given” in which you will denote Y (yes) or N (no) for whether the values are given in the paper. This will allow us to scrape that data later if we need it.
  5. If the latitude and longitude of the site are not given, you do not have to look them up, but do try to be as specific as possible for the source.population column.
  6. If the duration of a treatment is vague, and given as a few hours, several hours, or overnight, just write the wording used in the paper in the column and make a note in the notes column that this is all it said.
  7. We have decided that filling out the treatment column is not critical, although some of us may find it useful to still include it. In the latest version of the oegres file, I have coloured the column names orange for each column that we definitely need data entered for. The unfilled columns are ones we would love to include, but could still do the analysis without.
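
As a rough illustration of decisions 4 and 6, a row check could look something like the sketch below. It is only a sketch under assumptions: `seed.mass.given` and the notes column come from this thread, but the function name `check_row`, the key `notes` for the notes column, and the duration column name `treatment.duration` are hypothetical stand-ins and may not match the actual oegres.xlsx headers.

```python
from typing import Dict, List


def _is_number(text: str) -> bool:
    """Return True if the text parses as a plain number (e.g. "24" or "1.5")."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_row(row: Dict[str, str], duration_col: str = "treatment.duration") -> List[str]:
    """Return a list of problems found in one data-entry row.

    `duration_col` is a hypothetical column name used for illustration only.
    """
    problems: List[str] = []

    # Decision 4: seed.mass.given is Y (yes) or N (no).
    if row.get("seed.mass.given") not in ("Y", "N"):
        problems.append("seed.mass.given must be Y or N")

    # Decision 6: vague durations ("overnight", "a few hours") are entered
    # verbatim, but should come with a note saying that is all the paper said.
    duration = (row.get(duration_col) or "").strip()
    notes = (row.get("notes") or "").strip()
    if duration and not _is_number(duration) and not notes:
        problems.append(f"{duration_col} is vague wording; add a note in the notes column")

    return problems
```

For example, `check_row({"seed.mass.given": "N", "treatment.duration": "overnight", "notes": "paper only says overnight"})` returns an empty list, while the same row with an empty notes entry is flagged.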
DeirdreLoughnan commented 2 years ago

I will follow up with Dinara to clarify how the ILL process worked and we can discuss at the Monday meeting whether it is worth having everyone do their own ILL or let them accumulate for us to do later (if it is not time sensitive).

lizzieinvancouver commented 2 years ago

@DeirdreLoughnan This is great -- thanks! I haven't looked at the Excel file, but will aim to next week when I hope to do data entry. One query:

We do not need to scrape data for red light for example or difficult to scrape variables on the log scale.

What do you mean by 'difficult to scrape variables on the log scale'? This could introduce a bias in our data if we do it often (by not including responses so extreme they are reported on log scale).... but it could be reasonable if we cannot accurately scrape the data. Can you give me an example? And we should make sure we're excluding for this reason very rarely.

DeirdreLoughnan commented 2 years ago

As we have gone on in the data scraping I have fielded a lot of questions regarding the error bars. The following are my suggestions:

  1. If it does not specify if error is SE or SD, still scrape it, but have "not.specified" as the error type
  2. If the figure has error bars but they are indistinguishable from the data point, enter error as 0
  3. If the error bars are indistinguishable from other error bars, do enter the type.of.error, but in the column for the value put "indistinguishable"
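
The three suggestions above could be sketched in Python roughly as follows. This is an illustrative sketch, not part of the actual workflow: the function `encode_error` and its parameter names are hypothetical, and only the entries "not.specified", 0, and "indistinguishable" come from the list above. Checking the marker case before the overlapping-bars case is also an assumption.

```python
from typing import Optional, Tuple, Union

ErrorValue = Union[float, str]


def encode_error(
    error_type: Optional[str],
    bar_size: Optional[float],
    hidden_by_marker: bool = False,
    overlaps_other_bars: bool = False,
) -> Tuple[str, ErrorValue]:
    """Return (type.of.error entry, error value entry) for one scraped point."""
    # Suggestion 1: still scrape the error, but label an unstated type.
    type_entry = error_type if error_type else "not.specified"

    # Suggestion 2: bars indistinguishable from the data point are entered as 0.
    if hidden_by_marker:
        return type_entry, 0

    # Suggestion 3: bars indistinguishable from other bars keep their type,
    # but the value is recorded as the text "indistinguishable".
    if overlaps_other_bars:
        return type_entry, "indistinguishable"

    if bar_size is None:
        raise ValueError("a measured bar size is needed when the bar is readable")
    return type_entry, bar_size
```

For instance, `encode_error(None, 1.5)` gives `("not.specified", 1.5)`, and `encode_error("SD", None, overlaps_other_bars=True)` gives `("SD", "indistinguishable")`.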
toluam commented 2 years ago

[Image: Screen Shot 2022-09-06 at 4 44 33 PM]

Deirdre and I met and we decided for figures like the above, we should not take measurements of the error bars that are indistinguishable. We also decided not to try to extract points like the black circle data, because they are no longer visible past 18 weeks.

lizzieinvancouver commented 2 years ago

Deirdre and I met and we decided for figures like the above

That sounds like a very reasonable decision!

lizzieinvancouver commented 2 months ago

@DeirdreLoughnan If we can point to a methods file or similar in this issue, then we could close it!

lizzieinvancouver commented 1 month ago

One last query answered: What do people do if they cannot get their paper at UBC (or are all assigned papers available)? They got them from ILL or ResearchGate; we tried to get ALL English-language papers.