BIOL548O / Discussion

A repository for course discussion in BIOL548O

Finding data for victorchks #6

Closed aammd closed 8 years ago

aammd commented 8 years ago

Hello @victorchks,

In the pre-course survey, you indicated that you don't have any data. Experience with real, messy data of your own is more useful to you than working through prepared exercises. Therefore, I wanted to discuss what kind of data you might be able to use.

Here are some suggestions (ranked very loosely from most to least helpful):

  • Data from a pilot experiment
  • Data from your supervisor / collaborator / senior grad student (assure them that these data will remain confidential and unpublished)
  • Data from any previous degree you've done (Honours / Masters). This would be especially useful to you if the data are unpublished (even if the paper is published)
  • Published data from a paper in your field
  • Data emailed to you from an author in your field
  • Data (relevant to your work) obtained from the Web, i.e. from a data archive or elsewhere. If you know where to look, I can help you get it.

Plan B

If none of the above datasets work, we'll have to find something else. This could be a part of the Kaggle dataset, the candy dataset, or some other random dataset you enjoy. I could also give you some of my own data which are still unpublished.

Which of these options appeals most to you? Let's try to figure this out by Thursday.

Andrew

victorchks commented 8 years ago

Hi Andrew,

I have data from a collaborating grad student, so I think I'm good for data; it's a really messy set of data, but hopefully it'll be helpful.

On a side note, I think I've come down with the flu and may not be able to attend today's class. If I can't attend, will you be posting the day's lesson and instructions as usual, and will you hold office hours in case we have questions?

Thanks!

Vic

On Mon, Feb 8, 2016 at 10:59 PM, Andrew MacDonald notifications@github.com wrote:

Hello @victorchks https://github.com/victorchks,

In the pre-course survey, you indicated that you don't have any data. Experience with real, messy data of your own is more useful to you than working through prepared exercises. Therefore, I wanted to discuss what kind of data you might be able to use.

Here are some suggestions (ranked very loosely from most to least helpful):

  • Data from a pilot experiment
  • Data from your supervisor / collaborator / senior grad student (assure them that these data will remain confidential and unpublished)
  • Data from any previous degree you've done (Honours / Masters). This would be especially useful to you if the data are unpublished (even if the paper is published)
  • Published data from a paper in your field
  • Data emailed to you from an author in your field
  • Data (relevant to your work) obtained from the Web, i.e. from a data archive or elsewhere. If you know where to look, I can help you get it.

Plan B

If none of the above datasets work, we'll have to find something else. This could be a part of the Kaggle dataset, the candy dataset, or some other random dataset you enjoy. I could also give you some of my own data which are still unpublished.

Which of these options appeals most to you? Let's try to figure this out by Thursday.

Andrew



aammd commented 8 years ago

The messier the better! Working with messy data is a great way to train your data-management skills.