DatSciR / intro_prog_fun

Introduction to functional programming

Improvements #4

Closed Julenasti closed 2 months ago

Julenasti commented 7 months ago

I think this example could be updated to use the new purrr shortcut (x) DONE https://github.com/Julenasti/intro_prog_fun/blob/fd886c0fb9862b0a50cbe1b88255a9c6c71d0a5a/intro_prog_fun.qmd#L221

I think the function here could be given a more descriptive name DONE https://github.com/Julenasti/intro_prog_fun/blob/1dce3c43699c104d78fb6026818d5e36abb44156/intro_prog_fun.qmd#L353

Julenasti commented 7 months ago
Julenasti commented 6 months ago

@VeruGHub I've reviewed your updates. I think it looks great!! I've added a few suggestions for you. Search for the word julen to find them. Thanks!! https://github.com/Julenasti/intro_prog_fun/commit/180c242e9d11d96422466db9ee208a575d00880e

VeruGHub commented 6 months ago

I've simplified the section on how to write a function by removing the if else part. I think it was too complex for this course. Let me know what you think if you have time before Wednesday.

Julenasti commented 6 months ago
VeruGHub commented 6 months ago
Julenasti commented 2 months ago

@VeruGHub I'd try to send them the notes a bit before next Thursday, even if we change a few small things afterwards, so that they can at least read through the notes once before the course starts, because this was exactly one of the suggestions from the last course. Mainly for my part, which has quite a lot of theory. What do you think?

My part has already been revised, including the changes from transmitting and the improvements listed above. I've left you a few comments marked with ja:


I haven't put this in the notes, but since you ask about it in the comment above, here is some info on why iterating over rows in purrr is awkward:

Iterating over rows using the map function from the purrr package in R feels more awkward compared to iterating over columns for a few reasons:

  1. Column-major storage of data frames. A data frame in R is a list of equal-length column vectors, so each column is a single list element whose values are stored together. It is therefore natural to iterate over columns using list-based functions like those in purrr (e.g., map(), map_df(), etc.), which are designed to operate over lists (or column-like structures).

When you try to iterate over rows, each row is composed of elements from multiple columns, making it less natural to use these list-based functions. Extracting a row from a data frame requires reformatting or transposing the data, which can feel more cumbersome.
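To make the "list of columns" point concrete, here is a minimal sketch. The data frame df and its column names are made up for illustration and are not taken from the course notes.

```r
library(purrr)

# Hypothetical example data: two numeric columns and one character column
df <- data.frame(
  a = 1:3,
  b = c(2.5, 3.5, 4.5),
  c = c("x", "y", "z"),
  stringsAsFactors = FALSE
)

is_list <- is.list(df)             # a data frame is a list of columns
n_elements <- length(df)           # one list element per column, not per row
col_classes <- map_chr(df, class)  # the map() family visits one column at a time
```

Because the list elements are the columns, map() never sees a row unless the data are restructured first.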

  2. Data frame structure. Columns of a data frame are naturally treated as vectors or lists, which makes them compatible with functions like map() that are designed to iterate over list-like objects. Rows in a data frame are conceptually more heterogeneous: each element within a row can be of a different type (e.g., numeric, character, etc.), whereas columns are typically homogeneous (i.e., all elements are of the same type). This discrepancy makes map() feel less intuitive for row-wise iteration, because you are applying a function to heterogeneous data.

  3. Column-wise functions are direct. Using map() to apply functions to columns is straightforward because the data structure (a column) aligns perfectly with the expected input of list-like functions. For example, if you want to apply a function to each column, you simply write:

library(purrr)
map(df, some_function)

But if you want to apply a function row-wise, you have to first restructure the data, often by transposing or converting the rows into a list format:

df %>% 
  transpose() %>% 
  map(some_function)

This extra step of transposing or transforming rows into a list adds an additional layer of complexity and makes the operation feel less natural.
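As a possible alternative to transposing, purrr's pmap() family walks several list elements in parallel, and since a data frame is a list of columns this amounts to visiting it row by row. A minimal sketch follows; df and add_row_values are hypothetical names used only for illustration.

```r
library(purrr)

# Hypothetical example data
df <- data.frame(a = c(1, 2, 3), b = c(10, 20, 30))

# Hypothetical row-level function: receives one value per column,
# matched to its arguments by column name
add_row_values <- function(a, b) a + b

# pmap_dbl() calls add_row_values(a = 1, b = 10), then (a = 2, b = 20), ...
row_sums <- pmap_dbl(df, add_row_values)
```

The function's argument names have to match the column names (or the function has to accept ...), which is one more reason row-wise work in purrr needs extra care.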

  4. Alternative tools for row-wise operations. For row-wise operations, other tools such as apply() (from base R) or rowwise() (from dplyr) are often more straightforward and more commonly used than map(). apply() can do this because it is specialised to work with two-dimensional and higher vectors, i.e. matrices and arrays; a data frame is first converted to a matrix, so columns of mixed types are coerced to a common type. You can think of apply() as an operation that summarises a matrix or array by collapsing each row or column to a single value. Base functionals of this kind have more of a mathematical or statistical flavour and are generally less useful in data analysis (https://adv-r.hadley.nz/functionals.html, see 9.7 Base functionals). For example:

# Using apply for row-wise operations
apply(df, 1, some_function)

# Using dplyr's rowwise
df %>% rowwise() %>% mutate(new_col = some_function(...))

This means that while column-wise operations with map() are natural in purrr, row-wise operations tend to have more established alternatives, making it feel less intuitive to force row-wise iteration into the purrr framework.
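One thing to watch for with apply() on a data frame is the conversion to a matrix. The sketch below uses base R only; df and its columns are invented for illustration.

```r
# Hypothetical example data with mixed column types
df <- data.frame(
  id = c("x", "y"),
  v1 = c(1, 2),
  v2 = c(10, 20),
  stringsAsFactors = FALSE
)

# apply() converts the data frame to a matrix first; with a character
# column present, every row reaches the function as a character vector
row_types <- apply(df, 1, class)

# Restricting to the numeric columns keeps the rows numeric
row_totals <- apply(df[, c("v1", "v2")], 1, sum)
```

Selecting only the columns of a single type before calling apply() avoids the silent coercion.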

  5. Row extraction results in named vectors. Extracting a row with df[i, ] returns a one-row data frame, and functions such as apply() pass each row to your function as a named vector (the names are the column names). This adds another layer of complexity because your function has to deal with those names, whether or not it needs them, whereas extracting a column gives a plain vector.
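The difference between extracting a row and extracting a column can be checked directly in base R. This is a small sketch with made-up data; df and the object names are assumptions for illustration.

```r
# Hypothetical example data
df <- data.frame(a = c(1, 2), b = c(10, 20))

first_row <- df[1, ]                 # still a data frame, with one row
first_row_vec <- unlist(first_row)   # named numeric vector, names "a" and "b"
first_col <- df$a                    # plain numeric vector, no names

# Inside apply(), each row is a named vector, so values can be picked by name
b_minus_a <- apply(df, 1, function(row) row[["b"]] - row[["a"]])
```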

Summary

VeruGHub commented 2 months ago

Okis, thanks for the info. I was only saying it should be mentioned because it is used in base R, since not everything is so tidy, and in case someone needs it. I can't see the images you uploaded to the images folder. Can you see mine? I've already added my part. I removed the base R programming part but kept the basic tidyverse programming. If we see that they already know tidyverse I can skip it. I've also reviewed your comments. I'm fine with sending them the notes beforehand. To discuss:

Julenasti commented 2 months ago

I've done a general review and pushed it to GitHub. I have no further comments on your part. I've also reviewed the solved exercises and everything is OK. That's it from my side! 🥳