data-cleaning-pipeline Search Results

sfbrigade/datasci-earthquake #12

Data Cleaning Pipeline

## Context We are hoping to automatically ingest our datasets from their sources (when possible and appropriate). This task is to do data quality validation to identify existing issues, and handle an…

oscarsyu updated 1 month ago

Open-Data-Tallahassee/vision-zero #14

Develop data cleaning pipeline

**Goal**: develop a series of scripts to: 1. fetch the data from FLHSMV (received in the form of CDs) 2. filter for events within Leon County 3. remove rows with missing lat/lon (and send that sub-…

shelbygreen updated 2 years ago
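
The filtering steps named in the snippet above (keep events in Leon County, set aside rows with missing lat/lon) could be sketched roughly as follows. This is a minimal illustration and not the project's code: the field names `county`, `lat`, and `lon`, the row-as-dict layout, and all function names are assumptions. The snippet is cut off before it says what happens to the rows with missing coordinates, so the sketch only returns them as a second list.

```python
import math


def is_missing(value):
    """Return True for None, blank or non-numeric text, and NaN."""
    if value is None:
        return True
    try:
        return math.isnan(float(str(value).strip()))
    except ValueError:
        return True


def filter_county(rows, county="Leon", county_field="county"):
    """Keep rows whose county field matches, ignoring case and padding."""
    target = county.strip().upper()
    return [
        row for row in rows
        if str(row.get(county_field, "")).strip().upper() == target
    ]


def split_by_coordinates(rows, lat_field="lat", lon_field="lon"):
    """Split rows into (complete, missing) by presence of both coordinates."""
    complete, missing = [], []
    for row in rows:
        if is_missing(row.get(lat_field)) or is_missing(row.get(lon_field)):
            missing.append(row)
        else:
            complete.append(row)
    return complete, missing


# Hypothetical sample rows, for illustration only.
sample = [
    {"county": "LEON", "lat": "30.44", "lon": "-84.28"},
    {"county": "Leon", "lat": "", "lon": "-84.30"},
    {"county": "Duval", "lat": "30.33", "lon": "-81.66"},
]
in_county = filter_county(sample)
complete, missing = split_by_coordinates(in_county)
```

Returning both lists, rather than dropping the incomplete rows, keeps them available for whatever follow-up the full issue describes.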

oss-slu/Enhancing-Bioinformatics-Research-through-LLM #3

Develop a simple data preprocessing pipeline for a specific …

Create a basic data preprocessing pipeline for a specific bioinformatics dataset to prepare it for LLM training. The pipeline should include steps for data cleaning, tokenization, and formatting.

AjithAkuthota23 updated 1 month ago

g4he/g4he #57

- Incoming organisation names should be normalised (e.g. organisation names should have all words capitalised except some known stop words) - lookups in canonical list of known name variants, and swit…

richard-jones updated 11 years ago
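
The capitalisation rule described in the snippet above (capitalise every word except some known stop words) might look something like the sketch below. The stop-word list, the function name, and the choice to always capitalise the first word are assumptions for illustration. The canonical-variant lookup is left out because the snippet is truncated before it is fully described.

```python
# Hypothetical stop-word list; the issue does not say which words it uses.
STOP_WORDS = frozenset({"of", "and", "the", "for", "in"})


def normalise_org_name(name, stop_words=STOP_WORDS):
    """Capitalise each word except stop words; collapse extra whitespace.

    The first word is always capitalised (an assumption). Every word is
    lower-cased first, so acronyms such as "UK" would become "Uk"; a real
    pipeline would likely need an exception list for those.
    """
    result = []
    for index, word in enumerate(name.split()):
        lower = word.lower()
        if index > 0 and lower in stop_words:
            result.append(lower)
        else:
            result.append(lower[:1].upper() + lower[1:])
    return " ".join(result)


example = normalise_org_name("university  OF oxford")
```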

NeuroTechX/moabb #193

Utility for testing EEG data-cleaning pipelines?

Hi there, Just stumbled upon this project: looks super-useful for reproducible science! I'm currently a collaborator on [PyPREP](https://github.com/sappelhoff/pyprep) (an MNE-Python reimplementatio…

a-hurst updated 5 months ago

Cocoon-Data-Transformation/cocoon #14

Error during date column casting (v0.1.153)

Hi, I am getting an error in the test colab notebook (v0.1.153): https://colab.research.google.com/github/Cocoon-Data-Transformation/cocoon/blob/main/demo/Cocoon_Stage_Demo.ipynb I did the dat…

Digma updated 4 days ago

hackforla/lucky-parking #469

Create data cleaning pipeline in GCP

### Overview We need to create a data cleaning pipeline that takes in raw input data from the Socrata API and updates the Google Cloud Platform database with the correctly formatted geospatial data. …

gregpawin updated 1 year ago

hackforla/lucky-parking #149

Create data cleaning pipeline in AWS

### Overview We need to create a data cleaning pipeline that takes in raw input data from the Socrata API and updates the AWS database with the correctly formatted geospatial data. ### Action items…

gregpawin updated 1 year ago

polar-computing/AerosolDelta #4

Phase 2: Data Retrieval and Cleaning Pipeline

## **Pipeline** 1. Download 1. Randolph Glacier Inventory (RGI) 5.0 Complete 2. MERRA-2 Aerosol Raster Modeled Data 3. CALIOP Aerosol Raster observation Version 3 Aerosol Profile Data 4. …

karanjeets updated 8 years ago

NYCPlanning/data-engineering #484

Data Library Overhaul

## Motivations Data library is a great starting point for the "extract" portion of dcpy, but there are multiple ways it's not meeting our needs. Our main area of focus is data quality, both on the p…

fvankrieken updated 2 weeks ago

1000+ results for data-cleaning-pipeline